Preface


One morning about the 10 July 1925 I suddenly saw light: Heisenberg’s
symbolic manipulation was nothing but the matrix calculus well-known to 
me since my student days. 


MAX BORN, in My Life, 1978


These notes are based on lectures given to second- and third-year mathe- 
matics undergraduates at Oxford and aim to take advantage of the alge- 
braic background which British mathematics students have acquired by the 
time they have the opportunity to study quantum mechanics. As such this 
book differs from most of the introductory texts, which are aimed primar- 
ily at physicists and have to develop the necessary mathematics as they 
go along. They presume an acquaintance with the basic ideas of vector 
spaces and inner product spaces, which are to be found in most elementary 
textbooks on linear algebra. (Though the first appendix contains a brief 
summary of the main definitions and results which are needed.) On the 
other side, I thought it wiser not to demand too great a familiarity with 
special functions, since mathematicians are no longer trained to recognize 
the hypergeometric function lurking behind a long calculation and camouflaged
by unexpected changes of variable. Although not strictly necessary,
it would be useful for the reader to have encountered the basic ideas of
Hamiltonian mechanics, as found, for example, in Chapter 4 of N.M.J. 
Woodhouse’s Introduction to analytical dynamics, Oxford, 1987. 

There are, of course, dangers with this approach. Quantum mechanics 
(in contrast to quantum field theory) can be subjected to a perfectly rigor- 
ous mathematical treatment, but this requires a more extensive knowledge 
of mathematics than the average undergraduate is likely to have acquired 
by this stage, and would probably hinder rather than help understand- 
ing. However, the basic structure of the theory can be set out in terms 
of elementary linear algebra even if some of the detail is missing. I have 
therefore tried to give the flavour of the correct argument, noting where 
there are genuine difficulties, but trying to avoid lies. It is hoped that, 
besides placing quantum theory firmly in the context of the mathematics 
course, it will also give students a new perspective on algebra by providing 
them with a different range of examples. 

On the other hand, in a first course in quantum theory I feel that it 
is best to build up confidence with simple examples before launching into 
the mathematical structure. For that reason the selection of topics is fairly 
standard, covering most of the elementary properties, including relativistic
quantum mechanics, but little scattering theory and avoiding quantum
field theory altogether. I have, however, indulged my own interests by in- 
cluding a brief introduction to the algebraic quantum theory in Chapter 
11. Chapters 9 and 19 on symmetry in quantum mechanics provide an 
opportunity to exploit the knowledge of group theory gleaned from algebra 
courses. For that reason the exposition concentrates on the groups, rather 
than the Lie algebras which appear in most physics texts. The material in 
these and other chapters and sections which are marked with an asterisk 
is not required for any of the other unmarked sections, and can be omitted 
on a first reading. I have included two approaches to perturbation theory, 
supplementing the usual Rayleigh-Schrödinger theory with an iterative approach
which is more closely linked to other approximation techniques. I
have also included a brief and, no doubt, tendentious account of some of 
the paradoxes of quantum mechanics, since there is no doubt that most 
students are keen to link the technical theory to popular accounts which 
they may have read. 

The development of quantum theory was anything but straightforward. 
It took the best part of thirty years to obtain a reasonable theory of systems 
with a finite number of degrees of freedom, and the search for an equally 
consistent quantum theory of fields continues to this day. During the early 
years the founders of the theory were often discouraged; witness Einstein’s 
1912 remark that ‘the more successes the quantum theory has the sillier 
it looks’. Even as the new theory became clearer in the mid 1920s the 
physicists who were creating it chased down many blind alleys and poured 
scorn on ideas which later proved to be correct. To try to give a bit more feel 
for the historical perspective I have included at the head of each section a 
quotation from one of those involved in this revolution in our understanding 
of the laws of physics. Most of these date from the time when the new laws 
were being uncovered and their discoverers were either still confused or still
elated at their progress, and I hope that they will help to give an impression 
of what went on. (I reluctantly omitted their pithy criticisms of each other, 
replete with earthy expletives and cries of ‘trash’ and ‘rubbish’.) 

The end of a proof is signalled by the symbol □. Exercises marked with a
° are based on questions asked in the Final Honour School of Mathematics 
at Oxford. 

The first part of this book appeared as Mathematical Institute Lecture 
Notes in 1988. I am grateful to several generations of students who read 
earlier versions of these notes, struggled with badly worded exercises and 
made helpful suggestions. I have included hints or partial solutions for 
those exercises which seemed to cause the most difficulties. 
Oxford K.C.H. 
May 1996
1 Introduction


Today I have made a discovery which is just as important as the discovery
of Newton. 


MAX PLANCK to his son Erwin, Autumn 1900 


He [de Broglie] has lifted one corner of the great veil.


ALBERT EINSTEIN, letter to Paul Langevin, 1924


In the first quarter of this century physicists were slowly and somewhat re- 
luctantly forced to realize that the view of the physical world, so painstak- 
ingly built up over the preceding four centuries, was in serious need of 
revision. In particular, the laws that governed the motion of particles on 
the atomic scale were not those of classical mechanics. Physics found itself 
catapulted into the strange new world of quantum theory in which the very 
notion of position or velocity of a particle became blurred and statistical 
laws dominated everything. 

With the benefit of another sixty years we can see that the gains far 
outweigh the extra difficulties of visualization that quantum theory intro- 
duced, for it made possible a real unification of physical theories that had 
previously been quite distinct. The same laws that governed the motion of 
a projectile or a planet also determined the shape of molecules, the energy
of chemical reactions and the optical and electronic properties of matter. 
The invention of the laser and the transistor were both inspired by quantum 
theory, and superconductors are also quantum mechanical devices. 

To appreciate why quantum theory was needed, it is helpful to trace 
its development from the closing years of the last century. At that time 
there was a growing confidence that the basic goals of physics were all but 
achieved. Newton’s laws of gravity and motion were now well established. 
The experimental discovery of radio waves had confirmed the predictions of 
Maxwell’s theory, which united electricity, magnetism and light. The next 
task was to apply statistical mechanics to explain how radiation interacted 
with matter and then the fundamental laws would all be known. Experi- 
ments aimed at improving the newly invented electric light bulb showed 
clearly that at any given temperature a hot body radiated a characteristic 
mix of light of different wavelengths, giving it a particular hue. By the 
spring of 1900 the experimental results began to diverge from the theoreti- 
cal prediction made by Willy Wien: classical physics was unable to explain 
the electric light. 

In the closing weeks of the last century Max Planck realized that this 
predicament could be avoided if radiation were emitted only in packets
or quanta. For, if the energy $E$ of such a packet was proportional to its
frequency $\omega$,

$$E = \hbar\omega, \tag{1.1}$$


then there would be insufficient energy to make high frequency packets, 
and the colour of the radiation would be modified. (This is now known 
as Planck’s law.) By taking the constant of proportionality $\hbar = 1.0546 \times
10^{-34}$ J s it was possible to obtain excellent agreement with experiment.
(The quantity $h = 2\pi\hbar$ is known as Planck’s constant. We have quoted a
modern measurement rather than the value known to Planck.) The microscopic
value of $\hbar$ meant that the bunching into packets would not normally
be perceptible; the quanta for sodium yellow street lights, for example,
carry an energy of $3 \times 10^{-19}$ J, and 100 watts would be enough power to
eject $3 \times 10^{20}$ such quanta every second.
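
The arithmetic behind these figures can be checked with a short calculation. The sketch below is only illustrative: it takes the value of $\hbar$ quoted in the text, while the speed of light and the sodium wavelength of $5.89 \times 10^{-7}$ m are assumed values, and the function name `photon_energy` is hypothetical.

```python
import math

HBAR = 1.0546e-34      # J s, the value quoted in the text
C = 2.998e8            # m/s, speed of light (assumed value)
WAVELENGTH = 5.89e-7   # m, sodium yellow line (assumed value)


def photon_energy(omega):
    """Energy E = hbar * omega of one quantum of angular frequency omega."""
    return HBAR * omega


omega = 2 * math.pi * C / WAVELENGTH   # angular frequency in radians per second
energy = photon_energy(omega)          # joules carried by one quantum
rate = 100.0 / energy                  # quanta emitted per second at 100 W

print(f"E = {energy:.2e} J, rate = {rate:.2e} quanta per second")
```

Under these assumptions the energy comes out at a few times $10^{-19}$ J and the rate at a few times $10^{20}$ quanta per second, in line with the orders of magnitude stated above.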

Planck seems to have thought that these packets of energy simply repre- 
sented the mode by which atoms were able to release energy. The process 
seemed analogous to the way in which a plucked string of a particular 
length emits a note of a certain pitch, so that it might give a clue to the 
structure of atoms. It was Albert Einstein who realized in 1905 that the 
quanta were not a feature of atoms but of light itself, and he showed how 
this would explain another puzzling phenomenon, the photoelectric effect. 

Solar panels have familiarized us all with the way in which light striking 
certain materials can eject electrons and cause a current to flow. The 
puzzle was that increasing the frequency of the light increased the energy 
of the electrons but not their number (below a certain frequency none were 
emitted at all), whilst increasing the intensity of the light affected the 
numbers but not the energy. Einstein pointed out that if one thought of 
the light as a stream of quanta or photons whose energy was proportional 
to their frequency then at low frequency no photon would have enough 
energy to eject an electron at all, but as the frequency and energy of each 
photon increased, so it could transfer more energy to an electron. Since 
each photon would eject a single electron one could see why the numbers 
of electrons would not increase. Changing the intensity of the light, on the 
other hand, increased the number of photons available without affecting 
the energy of each individual one, and so ejected more electrons of the 
same energy. 

Einstein’s suggestion was revolutionary, for it flew in the face of almost 
a century of experimentation that seemed to show conclusively that light 
consists of waves and not particles. Now it seemed that one had to consider 
it as both. For that reason the physical reality of photons was not widely 
accepted until new experimental data on the scattering of photons by elec- 
trons became available in the 1920s. Nonetheless, further confirmation of 
the link between matter and radiation came in 1913 when Niels Bohr com- 
bined the idea of quanta with Rutherford’s nuclear model of the atom to 
provide an explanation of atomic spectra. 

Retrospectively it may seem strange that Einstein did not take the next 
step of suggesting that if waves could mimic particles then perhaps particles 
could also behave like waves. This was proposed by Louis de Broglie in 
1923-4, and he showed that Einstein’s theory of relativity provided a unique 
consistent procedure for associating a wave to a particle. De Broglie’s 
idea was soon developed into a far wider and more powerful theory by 
Schrödinger, Heisenberg, Born, Dirac and others.

The reality of de Broglie’s particle waves was soon demonstrated by 
Davisson and Germer and by G.P. Thomson who showed that electrons 
passing through thin foil produce the same kind of interference and diffrac- 
tion patterns as light. (In fact physicists had been observing such inter- 
ference patterns for years without realizing their significance.) There are 
now many more sophisticated demonstrations of the same phenomenon us- 
ing such devices as the neutron interferometer (a carefully crafted crystal 
that splits a neutron beam into two and then recombines them to produce 
interference effects). 

Richard Feynman once remarked that all quantum effects are ultimately 
a consequence of the wave nature of matter. This is therefore an appropri- 
ate point at which to begin the mathematical discussion. 


Exercises 


1.1 Calculate the energy of a single photon of the following kinds of elec- 
tromagnetic radiation: 


(i) a radio wave of frequency 200 kHz ($= 2 \times 10^5$ Hz),
(ii) red light of frequency $4.95 \times 10^{14}$ Hz,
(iii) X-rays of frequency $10^{20}$ Hz,
(iv) gamma radiation of frequency 10? Hz. 


[Note that in Planck’s formula $\omega$ is an angular frequency so 1 Hz
corresponds to $\omega = 2\pi$ radians per second.]


1.2 A radio station broadcasts on a frequency of 200 kHz. Estimate the 
number of photons striking a 1 square metre aerial each second at a 
distance of 1000 km from a 200kW transmitter. 
How would your answer change if the aerial were on a space
probe at a distance of 3000 million km from the earth? 
2 Wave mechanics


At the moment I am struggling with a new atomic theory. If only I knew
more mathematics!


ERWIN SCHRÖDINGER, letter to Willy Wien, 27 December 1925


2.1. Schrödinger’s equation


According to Planck and Einstein the energy and frequency of light are
related by $E = \hbar\omega$. De Broglie observed that to be consistent with the
theory of relativity the momentum $\mathbf{p}$ of a wave should be $\hbar\mathbf{k}$ where $\mathbf{k}$ is the
wave vector of magnitude $2\pi$/wavelength, perpendicular to the wave front.
(This is because the special theory of relativity combines $E$ and $\mathbf{p}$ and
also $\omega$ and $\mathbf{k}$ into four-dimensional vectors, and linear relations between
particular components force the same relations for all other components as
well. This is explained in more detail in Section 17.1, which is independent
of the rest of the book.)

In the case of light the fixed propagation speed $c$ leads to the relations
$\omega = c|\mathbf{k}|$ and $E = c|\mathbf{p}|$, so that $|\mathbf{p}| = \hbar|\mathbf{k}|$ follows directly from Planck’s
law. Thus for photons de Broglie’s relationship adds nothing new, and it
had indeed already been introduced by Johannes Stark in 1909. However,
de Broglie’s other crucial idea was that these relationships might apply to
other elementary particles such as the electron as well.

De Broglie was able to use his new relationship together with geometrical
arguments to solve some simple problems, but further advances had to wait
until Erwin Schrödinger had discovered the wave equation satisfied by de
Broglie’s ‘matter wave’. Consider first a plane wave $\psi(t,\mathbf{x}) = \exp[-i(\omega t - \mathbf{k}\cdot\mathbf{x})]$.
By direct differentiation we obtain

$$i\hbar\frac{\partial\psi}{\partial t} = \hbar\omega\psi = E\psi \tag{2.1}$$
and
$$\frac{\hbar}{i}\nabla\psi = \hbar\mathbf{k}\psi = \mathbf{p}\psi. \tag{2.2}$$


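The normal relations between energy and momentum then provide a differential
equation for $\psi$. For example, a particle of mass $m$ in a potential
$V(\mathbf{x})$ has energy $E = |\mathbf{p}|^2/2m + V(\mathbf{x})$, so that

$$i\hbar\frac{\partial\psi}{\partial t} = E\psi = \left(\frac{|\mathbf{p}|^2}{2m} + V(\mathbf{x})\right)\psi = -\frac{\hbar^2}{2m}\nabla^2\psi + V(\mathbf{x})\psi. \tag{2.3}$$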
Unfortunately this is not consistent, because the solutions of this equation
are not usually plane waves. Schrödinger therefore suggested that one
should regard the partial differential equation

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi \tag{2.4}$$

as more basic and simply use its solutions, $\psi(t,\mathbf{x})$, to describe the waves.


Definition 2.1.1. The equation (2.4) is known as Schrödinger’s equation.


The equation can easily be modified for particles moving in one or two 
dimensions by substituting the appropriate form of $\nabla^2$.

One natural approach to finding solutions of this equation is to look
first for separable solutions that are products of functions $T(t)$ and $\Psi(\mathbf{x})$.
Substituting $\psi = T\Psi$ into Schrödinger’s equation and dividing by $T\Psi$ to
separate the variables then gives

$$i\hbar\frac{dT}{dt}\Big/T = \left(-\frac{\hbar^2}{2m}\nabla^2\Psi + V\Psi\right)\Big/\Psi. \tag{2.5}$$

Since the two sides of this equation depend on different sets of variables,
each must be a constant, and it will turn out to be consistent with previous
notation to call this constant $E$. This gives us two equations

$$i\hbar\frac{dT}{dt} = ET \tag{2.6}$$
and
$$-\frac{\hbar^2}{2m}\nabla^2\Psi + V\Psi = E\Psi. \tag{2.7}$$

The first equation for $T$ can immediately be integrated to give
$$T(t) = e^{-iEt/\hbar}T(0) \tag{2.8}$$
so that
$$\psi(t,\mathbf{x}) = e^{-iEt/\hbar}\psi(0,\mathbf{x}). \tag{2.9}$$
Multiplying the first equation by $\Psi$ and the second by $T$ casts these into
another useful form

$$i\hbar\frac{\partial\psi}{\partial t} = E\psi = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi. \tag{2.10}$$

We have thus recovered the relation between the energy and the wave
function postulated by Schrödinger, and so retrospectively justified the use
of the notation $E$ for the constant of separation.


Definition 2.1.2. The equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = E\psi$$

that determines the spatial form of the wave function is also known
as Schrödinger’s equation or, more precisely, as Schrödinger’s
time-independent equation.


Remark 2.1.1. Both of Schrödinger’s equations are linear; that is, when
$\psi_1$ and $\psi_2$ are solutions so is $c_1\psi_1 + c_2\psi_2$, for any complex numbers $c_1$ and
$c_2$. In more algebraic language this says that the solutions form a complex
vector space, a fact that will often be useful in the examples that follow.


2.2. The square well 


As a first example let us consider a particle moving within the interval
$[0, a]$ on the $x$-axis under the influence of the potential $V(x) = 0$. The zero
potential allows it to move freely within the interval.
Schrödinger’s time-independent equation for a particle moving in one
dimension with energy $E$ is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V\psi = E\psi, \tag{2.11}$$

and to avoid problems at the endpoints we make $\psi$ vanish there. This
leaves us with the differential equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi \tag{2.12}$$

for $x$ in the interval $(0,a)$, and the boundary conditions

$$\psi(0) = 0 = \psi(a). \tag{2.13}$$


This equation is the same as that which arises when we look for the 
normal modes of oscillation of a string fixed at the points 0 and a, and 
we can solve it in the same way. The general solution of the differential 
equation takes the form 


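$$\psi(x) = \begin{cases}
A\cosh(\sqrt{2m|E|}\,x/\hbar) + B\sinh(\sqrt{2m|E|}\,x/\hbar), & \text{if } E < 0; \\
A + Bx, & \text{if } E = 0; \\
A\cos(\sqrt{2mE}\,x/\hbar) + B\sin(\sqrt{2mE}\,x/\hbar), & \text{if } E > 0;
\end{cases} \tag{2.14}$$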


where A and B are constants. 

In each of the three cases the condition that $\psi(0) = 0$ forces $A$ to vanish.
When $E \le 0$ the other boundary condition, $\psi(a) = 0$, also forces $B$ to
vanish, and leaves only the trivial solution $\psi = 0$. However, for positive $E$
the boundary condition at $a$ can also be satisfied if $\sqrt{2mE}/\hbar = n\pi/a$ for
some integer $n$, giving

$$\psi(x) = \psi_n(x) = B\sin(n\pi x/a). \tag{2.15}$$

For $n = 0$ we again get the trivial solution $\psi_0 = 0$ and the solutions for $n$
and $-n$ are essentially the same (since a sign can always be absorbed into
the constant $B$), so we may as well take $n$ to be a positive integer.
The corresponding energy is given by

$$E = E_n = \frac{n^2\pi^2\hbar^2}{2ma^2}. \tag{2.16}$$


This reveals one of the remarkable features of quantum mechanics: the 
energy of the system can take only certain discrete values. 
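
To get a feeling for the size of these discrete energies, one can evaluate $E_n = n^2\pi^2\hbar^2/2ma^2$ numerically. The sketch below is illustrative only: the choice of an electron, its mass, and a well width of one nanometre are assumptions made for the example, and `energy_level` is a hypothetical helper name.

```python
import math

HBAR = 1.0546e-34   # J s, as quoted in Chapter 1
M_E = 9.109e-31     # kg, electron mass (assumed for illustration)
A = 1.0e-9          # m, well width (assumed for illustration)


def energy_level(n, m=M_E, a=A):
    """Return E_n = n^2 pi^2 hbar^2 / (2 m a^2) for a positive integer n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (n * math.pi * HBAR) ** 2 / (2 * m * a ** 2)


levels = [energy_level(n) for n in (1, 2, 3)]
for n, e in enumerate(levels, start=1):
    print(f"E_{n} = {e:.3e} J")
```

The levels grow like $n^2$, so the gaps between successive energies widen as $n$ increases.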


Definition 2.2.1. When the possible energies of a quantum system
are discrete and bounded below then the lowest possible energy is
called the ground state energy. The higher energies are known in increasing
order as the first excited state energy, the second excited state
energy, and so on.

A wave function corresponding to the ground state is called a 
ground state wave function, and a wave function corresponding to the 
k-th excited state energy is called a k-th excited state wave function. 
FIGURE 2.1. The three lowest energy wave functions for a particle in a finite box
(n = 1, 2, and 3).


Example 2.2.1. The ground state energy in the infinite square well potential
is $E_1$, and the $k$-th excited state energy is $E_{k+1}$. □


The constant $B$ that appears in $\psi_n$ is arbitrary, but it is customary, for
reasons that will become apparent later, to choose it so that

$$\int_0^a |\psi(x)|^2\,dx = 1. \tag{2.17}$$

The wave function is then said to be normalized. In this case the normalization
condition amounts to

$$1 = |B|^2\int_0^a \sin^2\frac{n\pi x}{a}\,dx = \frac{1}{2}a|B|^2, \tag{2.18}$$

so that

$$\psi_n(x) = \sqrt{\frac{2}{a}}\sin\frac{n\pi x}{a} \tag{2.19}$$

is an appropriate wave function (see the graphs in Figure 2.1).
Combining this spatial form with the time development for a separable
solution we arrive at

$$\psi_n(t,x) = \sqrt{\frac{2}{a}}\,e^{-iE_nt/\hbar}\sin\frac{n\pi x}{a}
= \sqrt{\frac{2}{a}}\,e^{-in^2\pi^2\hbar t/2ma^2}\sin\frac{n\pi x}{a}. \tag{2.20}$$

Although this example may seem rather unrealistic, it is now possible to
make quantum dots which are two-dimensional analogues of square wells,
and to check the predictions directly. D. Eigler and collaborators working
in an IBM laboratory have constructed a circular palisade of iron atoms,
14 nanometres in diameter, on a copper surface and checked the form of
standing waves within the circle using a scanning tunnelling electron microscope.
A photograph of the waves appears on the cover of the November
1993 issue of the magazine Physics Today.
2.3. The time evolution of the wave function


There is no reason to expect that a general wave function $\psi$ will be separable,
but as in other similar problems one can try to expand it as an infinite
sum of separable solutions. We therefore look for coefficients $c_n$ such that

$$\psi(t,x) = \sum_{n=1}^{\infty} c_n\psi_n(t,x)
= \sqrt{\frac{2}{a}}\sum_{n=1}^{\infty} c_n e^{-in^2\pi^2\hbar t/2ma^2}\sin\frac{n\pi x}{a}. \tag{2.21}$$

If we know the initial wave function, $\psi(0,x) = f(x)$, then we require
that

$$f(x) = \sqrt{\frac{2}{a}}\sum_{n=1}^{\infty} c_n\sin\frac{n\pi x}{a}. \tag{2.22}$$

We know from the theory of Fourier series that such an expansion is possible
for suitable functions $f$ (in fact it is sufficient for $f$ to be normalized), and
that the coefficients are given by

$$\sqrt{\frac{2}{a}}\,c_n = \frac{2}{a}\int_0^a f(x)\sin\frac{n\pi x}{a}\,dx, \tag{2.23}$$

that is

$$c_n = \sqrt{\frac{2}{a}}\int_0^a f(x)\sin\frac{n\pi x}{a}\,dx = \int_0^a f(x)\psi_n(x)\,dx. \tag{2.24}$$

(We shall see later that this last simplification is no accident.) It is easy
to check that when the Fourier series of $f''$ obtained by differentiating
twice is uniformly convergent, so too is the series for $\partial^2\psi(t,x)/\partial x^2$, and
the resulting function $\psi(t,x)$ does satisfy the general form of Schrödinger’s
equation.
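
The expansion procedure can be tried out numerically. The following is a minimal sketch rather than a definitive implementation: it works in units with $\hbar = m = a = 1$ (an assumption), approximates the integrals for $c_n$ by the midpoint rule, and uses the hypothetical initial state $f(x) \propto x(a - x)$, which is not taken from the text. The helper names are likewise illustrative.

```python
import cmath
import math

A = 1.0       # well width; units with hbar = m = 1 are an assumption
HBAR = 1.0
MASS = 1.0


def psi_n(n, x):
    """Normalized spatial wave function sqrt(2/a) sin(n pi x / a)."""
    return math.sqrt(2 / A) * math.sin(n * math.pi * x / A)


def energy(n):
    """E_n = n^2 pi^2 hbar^2 / (2 m a^2)."""
    return (n * math.pi * HBAR) ** 2 / (2 * MASS * A ** 2)


def coefficients(f, n_max, steps=2000):
    """c_n = integral of f(x) psi_n(x) over [0, a], by the midpoint rule."""
    h = A / steps
    xs = [(k + 0.5) * h for k in range(steps)]
    return [sum(f(x) * psi_n(n, x) for x in xs) * h
            for n in range(1, n_max + 1)]


def evolve(coeffs, t, x):
    """Partial sum of the series for psi(t, x) using the given coefficients."""
    return sum(c * cmath.exp(-1j * energy(n) * t / HBAR) * psi_n(n, x)
               for n, c in enumerate(coeffs, start=1))


def f(x):
    """Hypothetical initial state: normalized x(a - x), zero at both ends."""
    return math.sqrt(30 / A ** 5) * x * (A - x)


c = coefficients(f, 25)
print(abs(evolve(c, 0.0, 0.5) - f(0.5)))   # small: the series reproduces f
print(sum(abs(cn) ** 2 for cn in c))       # close to 1 for a normalized f
```

At $t = 0$ the partial sum should reproduce $f$ closely, and the squared moduli of the coefficients should add up to approximately 1.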


2.4. The interpretation of the wave function 


The usefulness of the wave equation in determining the quantized energy
levels of the system makes it natural to ask what the wave function $\psi$ means
physically. Max Born, modifying suggestions of de Broglie and Schrödinger,
came up with what is now the accepted interpretation:
FIGURE 2.2. The probability densities for the three lowest energy wave functions
in a square well. For large values of n the graph resembles a typical interference
pattern. (Each panel runs from 0 to a horizontally and from 0 to 2/a vertically.)


Assumption 2.4.1. Quantum mechanical particles are governed by
statistical laws and for normalized wave functions, $\psi$,

$$\rho(x) = |\psi(x)|^2$$

gives the probability density for the position of the particle.


Thus the probability that the particle described by $\psi(x)$ is in the subset
$S$ of $\mathbf{R}$ is

$$\int_S |\psi(x)|^2\,dx, \tag{2.25}$$

with similar expressions for two- and three-dimensional systems. Since
$|\psi(x)|^2$ is non-negative and we have agreed to normalize it so that

$$\int_{\mathbf{R}} |\psi(x)|^2\,dx = 1, \tag{2.26}$$

this assumption is consistent with the elementary requirements of probability
theory.
This interpretation makes good sense for the square well. In particular,
since $\psi(x)$ vanishes outside the well, the particle is almost sure to be found
inside the well. Within the well the probability density is

$$f_n(x) = \frac{2}{a}\sin^2\frac{n\pi x}{a} = \frac{1}{a}\left(1 - \cos\frac{2n\pi x}{a}\right). \tag{2.27}$$

Figure 2.2 shows the graphs of $f_n$ for $n = 1, 2$ and $3$.
The distribution function can be derived by integrating the density

$$F_n(x) = \int_0^x f_n(y)\,dy = \frac{x}{a} - \frac{1}{2n\pi}\sin\frac{2n\pi x}{a}. \tag{2.28}$$

Compared with the classical distribution $F(x) = x/a$ for a uniform distribution
in $[0,a]$ the quantum distribution oscillates, and this is another
typical feature of quantum mechanics arising from interference effects for
the wave function.


Example 2.4.1. To see how all this works in practice let us calculate
the probability that the quantum mechanical particle confined in the box
$[0,a]$ is actually within $\frac{1}{4}a$ of the centre of the box; in other words the
probability that it is within $[\frac{1}{4}a, \frac{3}{4}a]$.


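Solution. When the wave function is $\psi_n$ the probability is

$$\begin{aligned}
\int_{a/4}^{3a/4} |\psi_n(x)|^2\,dx &= F_n(3a/4) - F_n(a/4) \\
&= \left(\frac{3}{4} - \frac{1}{2n\pi}\sin\frac{3n\pi}{2}\right) - \left(\frac{1}{4} - \frac{1}{2n\pi}\sin\frac{n\pi}{2}\right) \\
&= \frac{1}{2} - \frac{1}{2n\pi}\left(\sin\frac{3n\pi}{2} - \sin\frac{n\pi}{2}\right).
\end{aligned}$$

Since $\sin 3n\pi/2 = -\sin n\pi/2$, this gives for the probability

$$\frac{1}{2} + \frac{1}{n\pi}\sin\frac{n\pi}{2} = \begin{cases}
\frac{1}{2} & \text{if } n \text{ is even} \\
\frac{1}{2} + (-1)^{\frac{1}{2}(n-1)}/n\pi & \text{if } n \text{ is odd.}
\end{cases}$$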
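
As a cross-check on this example, the probability can also be obtained by integrating the density $(2/a)\sin^2(n\pi x/a)$ numerically over the middle half of the box. This is a sketch under the assumption $a = 1$; the function names are hypothetical and the midpoint rule is simply one convenient choice of quadrature.

```python
import math


def prob_middle_half(n, a=1.0, steps=20000):
    """Midpoint-rule integral of (2/a) sin^2(n pi x / a) over [a/4, 3a/4]."""
    h = (a / 2) / steps
    total = 0.0
    for k in range(steps):
        x = a / 4 + (k + 0.5) * h
        total += (2 / a) * math.sin(n * math.pi * x / a) ** 2
    return total * h


def closed_form(n):
    """The closed form 1/2 + sin(n pi / 2) / (n pi) for the same probability."""
    return 0.5 + math.sin(n * math.pi / 2) / (n * math.pi)


for n in range(1, 7):
    print(n, prob_middle_half(n), closed_form(n))
```

For even $n$ both columns should agree with $\frac{1}{2}$, while for odd $n$ they alternate above and below it by $1/n\pi$.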


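It is also easy to work out the mean and variance for the particle’s
position. The mean is

$$\begin{aligned}
\int_0^a x f_n(x)\,dx &= \Big[xF_n(x)\Big]_0^a - \int_0^a F_n(x)\,dx \\
&= a - \left[\frac{x^2}{2a} + \frac{a}{4n^2\pi^2}\cos\frac{2n\pi x}{a}\right]_0^a \\
&= \frac{1}{2}a.
\end{aligned}$$

This result is also easily visible from the symmetry of the distribution about
its midpoint.
The second moment is

$$\begin{aligned}
\int_0^a x^2 f_n(x)\,dx &= \Big[x^2F_n(x)\Big]_0^a - \int_0^a 2xF_n(x)\,dx \\
&= a^2 - 2\int_0^a \left(\frac{x^2}{a} - \frac{x}{2n\pi}\sin\frac{2n\pi x}{a}\right)dx \\
&= a^2 - \frac{2}{3}a^2 + \frac{1}{n\pi}\left(\left[-x\frac{a}{2n\pi}\cos\frac{2n\pi x}{a}\right]_0^a + \int_0^a \frac{a}{2n\pi}\cos\frac{2n\pi x}{a}\,dx\right) \\
&= \frac{1}{3}a^2 - \frac{a^2}{2n^2\pi^2}.
\end{aligned}$$

From this we may deduce that the variance of the distribution is

$$\frac{1}{3}a^2 - \frac{a^2}{2n^2\pi^2} - \left(\frac{1}{2}a\right)^2 = \left(\frac{1}{12} - \frac{1}{2\pi^2n^2}\right)a^2. \qquad \square$$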


Remark 2.4.1. It will be noted that for large values of $n$ the quantum
mechanical variance approximates ever more closely to the variance $\frac{1}{12}a^2$
of the classical uniform distribution. The tendency of quantum mechanical
formulae to approach those of classical theory when the quantum number
is large has been called by Bohr the correspondence principle.
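
The approach to the classical value can be seen numerically. The sketch below, which assumes $a = 1$ and uses hypothetical function names, compares the variance formula $\left(\frac{1}{12} - \frac{1}{2\pi^2n^2}\right)a^2$ with a direct midpoint-rule evaluation of the moments of the density $(2/a)\sin^2(n\pi x/a)$.

```python
import math


def variance_quantum(n, a=1.0):
    """Closed-form variance a^2 (1/12 - 1/(2 pi^2 n^2))."""
    return a ** 2 * (1 / 12 - 1 / (2 * math.pi ** 2 * n ** 2))


def variance_numeric(n, a=1.0, steps=20000):
    """Variance of the density (2/a) sin^2(n pi x / a), by the midpoint rule."""
    h = a / steps
    mean = 0.0
    second = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        density = (2 / a) * math.sin(n * math.pi * x / a) ** 2
        mean += x * density * h
        second += x * x * density * h
    return second - mean ** 2


for n in (1, 2, 5, 50):
    print(n, variance_quantum(n), variance_numeric(n), 1 / 12)
```

The difference from the classical variance $\frac{1}{12}a^2$ falls off like $1/n^2$, so it is already small for moderate quantum numbers.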


2.5. Currents and probability conservation 


So far our discussion has glossed over one important point that could jeopardize
the whole statistical interpretation of the wave function: is probability
conserved? That is, if we arrange for

$$\int_{\mathbf{R}^3} |\psi(0,\mathbf{x})|^2\,d^3x = 1 \tag{2.29}$$
and then let $\psi$ evolve according to Schrödinger’s equation, will
$$\int_{\mathbf{R}^3} |\psi(t,\mathbf{x})|^2\,d^3x = 1 \tag{2.30}$$

for all values of $t$, or are particles merely evanescent with a fluctuating
probability of existence?
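For stationary states it is clear that the probability must be conserved,
as the above example illustrates, since

$$|\psi_n(t,x)|^2 = |e^{-iE_nt/\hbar}\psi_n(0,x)|^2 = |\psi_n(0,x)|^2. \tag{2.31}$$

In general, for any wave function $\psi$ satisfying Schrödinger’s equation in
a real potential $V$,

$$\begin{aligned}
\frac{\partial}{\partial t}|\psi(t,\mathbf{x})|^2
&= \frac{\partial\overline{\psi}}{\partial t}(t,\mathbf{x})\,\psi(t,\mathbf{x}) + \overline{\psi}(t,\mathbf{x})\,\frac{\partial\psi}{\partial t}(t,\mathbf{x}) \\
&= \overline{\frac{1}{i\hbar}\left(-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi\right)}\,\psi + \overline{\psi}\,\frac{1}{i\hbar}\left(-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi\right) \\
&= \frac{i}{\hbar}\left(-\frac{\hbar^2}{2m}\nabla^2\overline{\psi} + V\overline{\psi}\right)\psi - \frac{i}{\hbar}\,\overline{\psi}\left(-\frac{\hbar^2}{2m}\nabla^2\psi + V\psi\right) \\
&= \frac{i\hbar}{2m}\left(\overline{\psi}\,\nabla^2\psi - \psi\,\nabla^2\overline{\psi}\right) \\
&= \frac{i\hbar}{2m}\,\mathrm{div}\left(\overline{\psi}\,\mathrm{grad}\,\psi - \psi\,\mathrm{grad}\,\overline{\psi}\right).
\end{aligned} \tag{2.32}$$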


Definition 2.5.1. The probability current, $\mathbf{j}$, is defined by

$$\mathbf{j}(t,\mathbf{x}) = \frac{\hbar}{2im}\left(\overline{\psi}\,\mathrm{grad}\,\psi - \psi\,\mathrm{grad}\,\overline{\psi}\right).$$


Our previous calculation can now be summarized as a theorem.


Proposition 2.5.1. The probability density and probability current
satisfy the continuity equation

$$\frac{\partial\rho}{\partial t} + \mathrm{div}\,\mathbf{j} = 0.$$


Like its analogues in other areas of applied mathematics this equation
leads directly to a conservation law for probability.


Proposition 2.5.2. Suppose that for all $t$ the probability current
$\mathbf{j}(t,\mathbf{x})$ tends to 0 faster than $|\mathbf{x}|^{-2}$ as $|\mathbf{x}| \to \infty$. Then

$$\int_{\mathbf{R}^3} \rho(t,\mathbf{x})\,d^3x$$

is independent of $t$.


Proof. If $D$ is the volume enclosed by a surface $S$ then for suitably well-behaved
functions $\psi$

$$\frac{d}{dt}\int_D \rho\,d^3x = \int_D \frac{\partial\rho}{\partial t}\,d^3x = -\int_D (\mathrm{div}\,\mathbf{j})\,d^3x = -\int_S \mathbf{j}\cdot d\mathbf{S}. \tag{2.33}$$

By considering a sphere $S$ of radius $R$ we see that if $\mathbf{j}$ tends to 0 faster than
$1/R^2$ for large $R$ then this last surface integral tends to 0 as $R \to \infty$, and
so

$$\frac{d}{dt}\int_{\mathbf{R}^3} \rho\,d^3x = 0. \tag{2.34}$$
Thence,
$$\int_{\mathbf{R}^3} \rho(t,\mathbf{x})\,d^3x \tag{2.35}$$

is independent of $t$. (In particular, if it is 1 when $t = 0$, then it is 1 for all
values of $t$.) □


When considering the meaning of the probability current it is worth
noting that

$$\mathbf{j}(t,\mathbf{x}) = \frac{\hbar}{2im}\left(\overline{\psi}\,\mathrm{grad}\,\psi - \psi\,\mathrm{grad}\,\overline{\psi}\right) = \frac{1}{m}\mathrm{Re}\left(\overline{\psi}\,\frac{\hbar}{i}\,\mathrm{grad}\,\psi\right). \tag{2.36}$$

According to Schrödinger the vector operator $\mathbf{P} = -i\hbar\nabla$ applied to a wave
function $\psi$ gives its momentum, so that $\mathbf{P}/m$ can be thought of as its
velocity. Thus $\mathbf{j}$ looks like the density times the velocity of the wave, which
is similar to the way in which the electric current is defined as the charge
density times the velocity.

This becomes particularly clear if $\psi$ is a plane wave $\exp(i\mathbf{k}\cdot\mathbf{x})$. Although
it is impossible to normalize the wave function since $\rho(\mathbf{x}) = |\psi(\mathbf{x})|^2 = 1$,
we nonetheless have

$$\mathbf{j}(\mathbf{x}) = \frac{1}{m}\mathrm{Re}\left(e^{-i\mathbf{k}\cdot\mathbf{x}}\,\hbar\mathbf{k}\,e^{i\mathbf{k}\cdot\mathbf{x}}\right) = \frac{\hbar\mathbf{k}}{m}. \tag{2.37}$$

In other words, the probability current is constant and equal to the momentum
of the wave divided by its mass.
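
The plane-wave result can be confirmed with a small numerical experiment in one dimension. This is only a sketch: it assumes units with $\hbar = m = 1$, replaces the gradient by a central difference, and the function name `current` and the wave number $k = 2.5$ are arbitrary illustrative choices.

```python
import cmath

HBAR = 1.0   # units with hbar = m = 1 are an assumption
MASS = 1.0


def current(psi, x, h=1e-5):
    """One-dimensional j = (1/m) Re(conj(psi) (hbar/i) dpsi/dx).

    The derivative is approximated by a central difference of step h.
    """
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)
    return (psi(x).conjugate() * (HBAR / 1j) * dpsi).real / MASS


k = 2.5   # hypothetical wave number


def plane_wave(x):
    """The unnormalizable plane wave exp(i k x)."""
    return cmath.exp(1j * k * x)


for x in (0.0, 0.3, 1.7):
    print(x, current(plane_wave, x), HBAR * k / MASS)
```

The computed current should be the same at every sample point and agree with $\hbar k/m$ up to the discretization error of the finite difference.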

Generally only normalized wave functions are ascribed any direct physi- 
cal significance, but plane waves are a useful approximation when one has a 
beam of particles whose energy and momentum are known very accurately. 
We shall consider examples of that in Chapter 5. 


Definition 2.5.2. To distinguish their different physical roles we call 
the normalizable wave functions bound states of the system, and the 


wave functions that cannot be normalized, scattering states. 


2.6. The statistical distribution of energy 


It is not only the position of the particle which is subject to statistical laws.
In Section 2.3 we saw that a general wave function for the square well can
be expanded in terms of separable wave functions:

$$\psi(t,x) = \sum c_n\psi_n(t,x) = \sqrt{\frac{2}{a}}\sum c_n e^{-iE_nt/\hbar}\sin\frac{n\pi x}{a}. \tag{2.38}$$

In this situation $i\hbar\,\partial\psi/\partial t$ is no longer of the form $E\psi$ so that the energy is
no longer unambiguously defined. Heisenberg and Born therefore proposed
the following:


Assumption 2.6.1. The probability of measuring the value $E_n$ is
$|c_n|^2$.


Remark 2.6.1. Parseval’s theorem for Fourier series gives

$$\sum_{n=1}^{\infty}\left|\sqrt{\frac{2}{a}}\,c_n\right|^2 = \frac{2}{a}\int_0^a |\psi(t,x)|^2\,dx \tag{2.39}$$

so, since $\psi$ is normalized, we have

$$\sum_{n=1}^{\infty} |c_n|^2 = 1. \tag{2.40}$$

This shows both that it is a reasonable definition, and that there is a zero
probability of finding an energy other than one of the $E_n$. □
Example 2.6.1. As an example, suppose that the wave function $\psi$ inside
the square well is just the constant $1/\sqrt{a}$ (the value being chosen to ensure
that $\psi$ is normalized). Then

$$c_n = \int_0^a \frac{1}{\sqrt{a}}\sqrt{\frac{2}{a}}\sin\frac{n\pi x}{a}\,dx = \frac{\sqrt{2}}{n\pi}\left(1 - (-1)^n\right).$$

The probability of measuring the energy $E_n$ is therefore zero if $n$ is even,
and is given by

$$|c_n|^2 = \frac{8}{n^2\pi^2}$$

if $n$ is odd. □
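
A direct calculation with the coefficient formula (2.24) gives $|c_n|^2 = 8/n^2\pi^2$ for odd $n$ and zero for even $n$ in this example, and this can be confirmed numerically. The sketch below assumes $a = 1$ and evaluates the integral for $c_n$ by the midpoint rule; the function name `c_n` is a hypothetical helper.

```python
import math


def c_n(n, a=1.0, steps=20000):
    """Midpoint-rule integral of (1/sqrt(a)) psi_n(x) over [0, a]."""
    h = a / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += math.sqrt(2 / a) * math.sin(n * math.pi * x / a) / math.sqrt(a)
    return total * h


probabilities = {n: c_n(n) ** 2 for n in range(1, 8)}
for n, p in probabilities.items():
    print(n, p)
```

The odd-$n$ probabilities decrease like $1/n^2$, so the ground state energy is by far the most likely outcome of a measurement in this state.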


2.7. Historical notes 


Schrödinger’s attention was apparently drawn to de Broglie’s thesis by
Einstein and he gave a colloquium on it in Zürich towards the end of 1925.
According to Felix Bloch, at the end of Schrödinger’s lecture his colleague
Peter Debye remarked that the proper way to deal with waves was by using
wave equations. Schrödinger soon found an equation, but was discouraged
when it predicted the wrong spectral lines for the hydrogen atom. Over
Christmas he went up to the mountain resort of Arosa with a girl friend.
Whilst there he returned to the equation and noticed that if one used
the classical relation between the energy and momentum (as we did in
Section 2.1), rather than its relativistic analogue, then the hydrogen atom
spectrum came out correctly, as did many other problems. (This was a
lucky accident. When in 1928 Paul Dirac discovered the correct relativistic
equation it turned out that the relativistic effects that Schrödinger had
suppressed are exactly cancelled by the spin angular momentum of the
electron that he had also neglected.)

The previous June, whilst recuperating from a severe attack of hay fever
on the island of Helgoland, Werner Heisenberg had already come up with
the crucial idea that he, Max Born, Pascual Jordan, and Dirac had
developed into an algebraic quantum theory. By the beginning of November
Wolfgang Pauli, overcoming his initial distaste for abstract algebra, had
already shown that this approach gave the correct spectral lines for the
hydrogen atom. None of these was initially impressed by Schrödinger’s
work. (‘I quite agree with your criticism of Schrödinger’s ... wave theory
of matter’, Heisenberg wrote to Dirac on 26 May 1926, ‘this theory must be
inconsistent’.) However, before long Schrödinger and then Pauli had shown
that the two apparently quite different theories using wave equations and
algebra are actually equivalent.


Exercises 


2.1 A sodium atom with a mass of $3.82 \times 10^{-26}$ kg emits a photon with
a wavelength of $5.89 \times 10^{-7}$ m. Find the recoil velocity of the atom.
[Recall that $\hbar = 1.0546 \times 10^{-34}$ J s. Otto Frisch measured this recoil
in 1933.]


2.2 A particle of mass m moves in the interval [0, a] under the influence
of the constant potential $V(x) = V_0$. Show that the possible energies
of the system are

$$E = V_0 + \frac{n^2\pi^2\hbar^2}{2ma^2}.$$


2.3° A particle of mass m moves in the rectangle $[0,a] \times [0,b]$ in the xy-
plane under the influence of a zero potential. The wave function
vanishes at the boundaries of the region. By separating the variables
x and y in Schrödinger’s equation show that the permitted energies
of the system are

$$E = \frac{\pi^2\hbar^2}{2m}\left(\frac{j^2}{a^2} + \frac{k^2}{b^2}\right),$$

where j and k are positive integers.

In the case when a = b find two normalized wave functions corresponding
to the energy $5\pi^2\hbar^2/2ma^2$. Find the probability in each
case that the particle lies in the region $\{(x,y) \in \mathbf{R}^2 : x < y\}$.


2.4 A particle of mass m moves within a ball of radius a in $\mathbf{R}^3$ under the
influence of the potential V(r) = 0. Show that there are continuous
wave functions of the form $\psi(r)$ (independent of angles) that satisfy
Schrödinger’s equation with energy

$$E_n = \frac{n^2\pi^2\hbar^2}{2ma^2}$$

for n = 1, 2, .... What is the probability of finding the particle within
a distance $\tfrac12 a$ of the centre?
[You may assume that the wave function vanishes on the boundary.]

2.5 Show that the mean position of the particle in the previous question
is at the origin. Find the variance of its height above the centre.


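2.6° A particle of mass m moves freely within the interval [0, a] on the
x-axis. Initially the wave function is

$$\frac{1}{\sqrt{a}}\sin\left(\frac{\pi x}{a}\right)\left(1 + 2\cos\left(\frac{\pi x}{a}\right)\right).$$

Show that at a later time t the wave function is

$$\frac{1}{\sqrt{a}}\,e^{-i(\pi^2\hbar t/2ma^2)}\sin\left(\frac{\pi x}{a}\right)\left[1 + 2e^{-3i(\pi^2\hbar t/2ma^2)}\cos\left(\frac{\pi x}{a}\right)\right].$$

Hence, or otherwise, find the probability that at time t the particle
lies within the interval $[0, \tfrac12 a]$.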

2.7° A particle, moving freely between impenetrable barriers at x = 0 and
x = a, is in its lowest energy stationary state when the barrier at
x = a is suddenly displaced to x = 2a. By expanding the original
wave function in terms of separable solutions for the motion within
[0, 2a], find the wave function at a subsequent time t, and show that
it is a superposition of states of energies

$$E_n = \frac{n^2\pi^2\hbar^2}{8ma^2},$$

for n = 2 and n = 1, 3, 5, .... Show that the probability of finding the
particle energy unchanged is $\tfrac12$.


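2.8 Let $\rho(t,x) = |\psi(t,x)|^2$ be the probability density for a particle moving
on the x-axis, and

$$j(t,x) = \frac{\hbar}{2mi}\left(\overline{\psi}\,\frac{\partial\psi}{\partial x} - \psi\,\frac{\partial\overline{\psi}}{\partial x}\right)$$

the probability current. Show that

$$\frac{\partial\rho}{\partial t} + \frac{\partial j}{\partial x} = 0.$$

Show further that j vanishes identically if and only if there exists a
function $\lambda(t)$, such that $\lambda(t)\psi(t,x)$ takes only real values.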


2.9 Show that the probability current for a wave function of the form

$$\psi(t,\mathbf{x}) = \sum_\alpha A_\alpha e^{i\mathbf{k}_\alpha\cdot\mathbf{x}},$$

with $\mathbf{k}_\alpha \in \mathbf{R}^3$, can be expressed as

$$\mathbf{j}(t,\mathbf{x}) = \frac{\hbar}{2m}\sum_{\alpha,\beta}\overline{A_\alpha}A_\beta\,(\mathbf{k}_\alpha + \mathbf{k}_\beta)\,e^{i(\mathbf{k}_\beta - \mathbf{k}_\alpha)\cdot\mathbf{x}}.$$

Deduce that the probability current associated to the wave function
$\psi(t,\mathbf{x}) = A\exp(i\mathbf{k}\cdot\mathbf{x}) + B\exp(-i\mathbf{k}\cdot\mathbf{x})$ is

$$\mathbf{j}(t,\mathbf{x}) = \frac{\hbar}{m}\left(|A|^2 - |B|^2\right)\mathbf{k}.$$


3 Quadratic and linear potentials 


If anything like mechanics were true then one would never understand the 
existence of atoms. Evidently there exists another ‘quantum mechanics’. 


WERNER HEISENBERG, letter to Wolfgang Pauli, 21 June 1925 


3.1. The harmonic oscillator 


Although the classical examples of harmonic oscillators such as springs or 
the simple pendulum are of limited use in the submicroscopic world of 
quantum mechanics, there are plenty of other applications that make the 
oscillator worth studying. Atoms in a crystal, for instance, oscillate about 
their mean positions and it is their vibrations that carry sound and heat 
through the material. Indeed, we know from classical theory that by using 
normal coordinates we may regard any system performing small oscillations 
about a point of stable equilibrium as a collection of harmonic oscillators.
This extends even to the normal modes of vibration of a string or of the 
electromagnetic field. The harmonic oscillator thus provides the key to 
understanding how wave phenomena such as light can be quantized, and 
lies at the heart of Einstein’s idea of the photon and of Planck’s quantum. 


Proposition 3.1.1. The permissible energy levels of the harmonic
oscillator with potential energy $V(x) = \tfrac12 m\omega^2x^2$ and Schrödinger
equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{1}{2}m\omega^2x^2\psi = E\psi$$

form the sequence

$$E_N = \left(N + \tfrac12\right)\hbar\omega$$

for N = 0, 1, .... The corresponding wave functions take the form

$$\psi_N(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} H_N\left(\sqrt{\frac{m\omega}{\hbar}}\,x\right) e^{-m\omega x^2/2\hbar},$$

where $H_N$ is a polynomial of degree N.

Proof. This is not the sort of equation whose solutions immediately
spring to mind, so we shall adopt the following strategy. We start by
considering the behaviour when |x| is large, and looking for an approximate
solution $\phi$. This guides us to a substitution $\psi = f\cdot\phi$, and a differential
equation for f. Since $\phi$ already takes care of the long range behaviour
of the wave function, we are interested in the short range behaviour of f,
which suggests the use of a series solution.

At large distances E is small by comparison with the potential energy,
so

$$\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} \sim \frac{1}{2}m\omega^2x^2\psi. \tag{3.1}$$
The formula

$$\psi'' \sim \left(\frac{m\omega x}{\hbar}\right)^2\psi \tag{3.2}$$

suggests that we might start by considering the first-order differential
equations,

$$\phi' = \pm\left(\frac{m\omega x}{\hbar}\right)\phi, \tag{3.3}$$

whose solutions are the functions $\phi_\pm = \exp(\pm m\omega x^2/2\hbar)$. Since the
oscillator potential represents a strong attractive force, we would expect only a
low probability of finding the particle far from the origin, so it is the
function $\phi = \phi_-$ which is of real interest. (In any case $\phi_+$ is not normalizable.)
We can now check by direct differentiation that

$$-\frac{\hbar^2}{2m}\phi'' + \frac{1}{2}m\omega^2x^2\phi = \frac{1}{2}\hbar\omega\phi, \tag{3.4}$$

that is Schrödinger’s equation with $E = \tfrac12\hbar\omega$. For large x, where the
potential energy dominates E, this is almost the same as the equation for
$\psi$, so we expect that for any energy $\psi \sim \phi$.

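We shall therefore try a solution of the form $\psi = f\phi$. Substituting this
into Schrödinger’s equation, we obtain

$$-\frac{\hbar^2}{2m}\left(f''\phi + 2f'\phi' + f\phi''\right) + \frac{1}{2}m\omega^2x^2f\phi = Ef\phi. \tag{3.5}$$

Recalling that $\phi' = -(m\omega x/\hbar)\phi$ and $\phi'' = [(m\omega x/\hbar)^2 - (m\omega/\hbar)]\phi$, this
reduces to

$$f'' - 2\left(\frac{m\omega x}{\hbar}\right)f' - \frac{m\omega}{\hbar}f = -\frac{2mE}{\hbar^2}f, \tag{3.6}$$

or even more simply to

$$f'' - 2\left(\frac{m\omega x}{\hbar}\right)f' + \frac{2m\omega}{\hbar}\left(\frac{E}{\hbar\omega} - \frac{1}{2}\right)f = 0. \tag{3.7}$$

We can simplify the notation by setting $N = E/\hbar\omega - \tfrac12$, and changing the
variable to $\xi = x\sqrt{m\omega/\hbar}$. With these changes, the equation reduces to

$$\frac{d^2f}{d\xi^2} - 2\xi\frac{df}{d\xi} + 2Nf = 0. \tag{3.8}$$

We now try a series solution of the form

$$f = \sum_{n=0}^{\infty} a_n\xi^{n+c} \tag{3.9}$$

with $a_0 \neq 0$. Substitution gives

$$\sum_{n=0}^{\infty}(n+c)(n+c-1)a_n\xi^{n+c-2} - 2\sum_{n=0}^{\infty}(n+c)a_n\xi^{n+c} + 2N\sum_{n=0}^{\infty}a_n\xi^{n+c} = 0. \tag{3.10}$$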

The indicial equation coming from the coefficient of $\xi^{c-2}$ is

$$c(c-1) = 0, \tag{3.11}$$

so that c = 0 or 1. The coefficient of $\xi^{c-1}$ gives

$$(c+1)c\,a_1 = 0. \tag{3.12}$$

If c = 1 this forces $a_1$ to vanish, but if c = 0 there is no constraint on
$a_1$. However, by subtracting a suitable multiple of the c = 1 series, we
can ensure that the coefficient of $\xi$ vanishes anyway, and so without loss of
generality we take $a_1 = 0$.

Comparing coefficients of $\xi^{n+c-2}$ we arrive at the recurrence relation

$$(n+c)(n+c-1)a_n = 2(n+c-2-N)a_{n-2} \tag{3.13}$$

for $n \geq 2$. By induction we see that $a_n = 0$ for all odd n, whilst

$$a_n = \frac{2(n+c-2-N)}{(n+c)(n+c-1)}\,a_{n-2} \tag{3.14}$$

determines the even coefficients in terms of $a_0$.

Unless the coefficients vanish for n > N + 2 − c they all have the same
sign, and we may as well assume that they are all positive. We need only
consider the case of even n, and for definiteness we take c = 0 (the case of
c = 1 being similar). Then we can exploit some cancellations to obtain

$$[n/2]!\,a_n = \left(1 - \frac{N+1}{n-1}\right)[(n-2)/2]!\,a_{n-2}. \tag{3.15}$$

For n > 3N + 4 the factor 1 − (N + 1)/(n − 1) lies in the interval (2/3, 1),
and so, for $n \geq 2(j+1) > 3N + 4$, we have iteratively

$$[n/2]!\,a_n > \frac{2}{3}\,[(n-2)/2]!\,a_{n-2} > \cdots > \left(\frac{2}{3}\right)^{(n-2j)/2} j!\,a_{2j}. \tag{3.16}$$

Introducing $K = j!\,(3/2)^j a_{2j}$, for fixed j, we arrive at the inequality

$$a_n > \frac{K}{[n/2]!}\left(\frac{2}{3}\right)^{n/2}, \tag{3.17}$$

and similar inequalities can be found when c = 1. Summing this over even
n = 2m, and introducing the polynomial

$$P(\xi) = \sum_{m=0}^{j}\left(a_{2m} - \frac{K}{m!}\left(\frac{2}{3}\right)^m\right)\xi^{2m}, \tag{3.18}$$

we obtain

$$f(\xi) = \sum_{m=0}^{\infty}a_{2m}\xi^{2m} > \sum_{m=0}^{\infty}\frac{K}{m!}\left(\frac{2}{3}\right)^m\xi^{2m} + P(\xi) = Ke^{2\xi^2/3} + P(\xi). \tag{3.19}$$


Thus $f(\xi)\exp(-\tfrac12\xi^2)$ grows faster than $\exp(\xi^2/6)$ for large $\xi$, and we are
back to precisely the sort of exponentially growing wave function that we
were trying to avoid. We can only escape from this dilemma if the
coefficients vanish for large n, so that the series actually terminates and f is a
polynomial.

This happens if for some even $n \geq 2$ we have

$$N = n + c - 2, \tag{3.20}$$

since then $a_n$ and all successive coefficients vanish. Bearing in mind the
fact that c can be either 0 or 1 the above condition just means that N
must be a non-negative integer. Returning to the definition of N we are
now ready to deduce that the energy levels have the form stated in the
proposition. The fact that these are the only possible values for the energy
follows from the fact that N must be a non-negative integer. Conversely,
for each of these energies the series for f terminates and provides a solution
$\psi_N$ for the wave function.
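
The termination argument can be illustrated by iterating the recurrence (3.13) directly. The Python sketch below is only an illustration under stated assumptions: it normalizes $a_0 = 1$, takes $a_1 = 0$, and uses the hypothetical helper names `series_coefficients` and `polynomial`. Exact rational arithmetic is used so that vanishing coefficients are detected exactly.

```python
from fractions import Fraction

def series_coefficients(N, c, terms=12):
    """Even coefficients a_0, a_2, a_4, ... from recurrence (3.13),
    with a_0 = 1 (a normalization chosen here) and a_1 = 0."""
    coeffs = {0: Fraction(1)}
    for n in range(2, 2 * terms, 2):
        ratio = Fraction(2 * (n + c - 2 - N), (n + c) * (n + c - 1))
        coeffs[n] = ratio * coeffs[n - 2]
    return coeffs

def polynomial(N):
    """Terminating solution f for a non-negative integer N,
    returned as a dictionary {power of xi: coefficient}."""
    c = N % 2  # N = n + c - 2 with n even forces c to have the parity of N
    return {n + c: a for n, a in series_coefficients(N, c).items() if a != 0}

if __name__ == "__main__":
    for N in range(5):
        print(N, polynomial(N))
```

For N = 2 this gives $f = 1 - 2\xi^2$ and for N = 3 it gives $f = \xi - \tfrac23\xi^3$, which are multiples of the usual Hermite polynomials $4\xi^2 - 2$ and $8\xi^3 - 12\xi$. For a non-integer value such as N = 1/2 no coefficient ever vanishes, so the series does not terminate.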

Explicitly, if N = 0, so that $E_0 = \tfrac12\hbar\omega$ and c = 0, we already have
$a_2 = 0$, so that f is just a constant, C, and

$$\psi_0 = Ce^{-m\omega x^2/2\hbar}. \tag{3.21}$$


[Figure 3.1: three rows of panels, labelled n = 0, n = 1 and n = 2.]

FIGURE 3.1. The wave functions (left) and probability densities (right) for the
ground state and first two excited states of the harmonic oscillator. The marks on
the horizontal scale show the extreme limits of the motion of a classical harmonic
oscillator of the same energy.
(This is, of course, essentially $\phi$, which we already knew to be a solution
of Schrödinger’s equation when $E = \tfrac12\hbar\omega$.) Since

$$|\psi_0|^2 = |C|^2e^{-m\omega x^2/\hbar}, \tag{3.22}$$

comparison with the normal probability distribution of variance $\hbar/2m\omega$
shows that we should take

$$|C|^2 = \sqrt{\frac{m\omega}{\pi\hbar}}. \tag{3.23}$$

From this the normalized ground state wave function is immediately seen
to be

$$\psi_0 = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}e^{-m\omega x^2/2\hbar}. \tag{3.24}$$

When N = 1 we have $E_1 = \tfrac32\hbar\omega$ and we must take c = 1. Again $a_2 = 0$,
and this time $f(\xi)$ is a multiple of $\xi^c = \xi$. We therefore have for the first
excited state of the harmonic oscillator

$$\psi_1 = C_1xe^{-m\omega x^2/2\hbar}, \tag{3.25}$$

where $C_1$ is a constant. In general the wave function takes the form

$$\psi_N = C_NH_N\left(\sqrt{\frac{m\omega}{\hbar}}\,x\right)e^{-m\omega x^2/2\hbar}, \tag{3.26}$$

where $H_N$ is a polynomial and $C_N$ is another constant. With appropriate
normalization $H_N$ is known as the N-th Hermite polynomial. The
properties of these polynomials will be discussed in more detail in Section 7.7.
The graphs of the first three wave functions are shown in Figure 3.1. □

One of the most important features of this solution is that the
quantum oscillator still has strictly positive energy even in its ground state:
quantum oscillators can never just sit inert at the point of equilibrium like
their classical counterparts. There are other differences from the
classical oscillator too, such as the possibility of finding the quantum particle
in the non-classical region that the classical particle lacks the energy to
reach. For example, the non-classical region for the ground state is where
$V > E = \tfrac12\hbar\omega$, that is $\tfrac12m\omega^2x^2 > \tfrac12\hbar\omega$, or $|\xi| > 1$. The probability of
finding the particle there is

$$\frac{1}{\sqrt{\pi}}\int_{|\xi|>1}e^{-\xi^2}\,d\xi = 0.157. \tag{3.27}$$

Alternatively, one can argue that, since the probability density is that of
a normal distribution with variance $\hbar/2m\omega$, the probability of finding the
particle in the non-classical region is the probability of being further than
$\sqrt{2}$ standard deviations from the mean in a normal distribution, which is
0.157. The probability of being in the non-classical region decreases, albeit
slowly, for larger values of n.
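
The figure 0.157 can be reproduced in a few lines. The Python sketch below is an illustration rather than part of the text: it uses the standard-library complementary error function, for which $\frac{1}{\sqrt\pi}\int_{|\xi|>1}e^{-\xi^2}d\xi = \operatorname{erfc}(1)$, and cross-checks it with a crude midpoint rule. The helper name `tail_integral` and the cut-off at $\xi = 8$ are assumptions made here.

```python
import math

# Ground-state probability of |xi| > 1 for the density exp(-xi^2)/sqrt(pi).
p_nonclassical = math.erfc(1.0)

def tail_integral(lower=1.0, upper=8.0, steps=20000):
    """Midpoint-rule estimate of (2/sqrt(pi)) * integral of exp(-xi^2)
    over [lower, upper]; the tail beyond `upper` is negligible."""
    h = (upper - lower) / steps
    total = sum(math.exp(-((lower + (k + 0.5) * h) ** 2)) for k in range(steps))
    return 2.0 * total * h / math.sqrt(math.pi)

if __name__ == "__main__":
    print(round(p_nonclassical, 3))   # 0.157
    print(round(tail_integral(), 3))
```

The same number arises from the normal-distribution argument, since the probability of lying more than k standard deviations from the mean is $\operatorname{erfc}(k/\sqrt2)$ and here $k = \sqrt2$.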


3.2. Harmonic oscillators in higher dimensions


Higher-dimensional oscillators can easily be handled by an appropriate
separation of variables. For instance, consider a two-dimensional oscillator
whose potential energy is $V = \tfrac12m(\omega_1^2x^2 + \omega_2^2y^2)$.


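Proposition 3.2.1. The energy levels of the two-dimensional
harmonic oscillator whose Schrödinger equation is

$$-\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2}\right) + \frac{1}{2}m\left(\omega_1^2x^2 + \omega_2^2y^2\right)\psi = E\psi$$

have the form

$$E = \left(N_1 + \tfrac12\right)\hbar\omega_1 + \left(N_2 + \tfrac12\right)\hbar\omega_2$$

for $N_1, N_2 = 0, 1, \ldots$. The corresponding wave functions may be
written as

$$\psi(x,y) = \left(\frac{m}{\pi\hbar}\right)^{1/2}(\omega_1\omega_2)^{1/4}\,H_{N_1}\left(\sqrt{\frac{m\omega_1}{\hbar}}\,x\right)H_{N_2}\left(\sqrt{\frac{m\omega_2}{\hbar}}\,y\right)\exp\left(-\frac{m}{2\hbar}\left(\omega_1x^2 + \omega_2y^2\right)\right).$$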


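Proof. Substituting a separable solution $\psi(x,y) = X(x)Y(y)$, we obtain

$$-\frac{\hbar^2}{2m}\left(\frac{X''}{X} + \frac{Y''}{Y}\right) + \frac{1}{2}m\left(\omega_1^2x^2 + \omega_2^2y^2\right) = E, \tag{3.28}$$

so that

$$-\frac{\hbar^2}{2m}\frac{X''}{X} + \frac{1}{2}m\omega_1^2x^2 = E + \frac{\hbar^2}{2m}\frac{Y''}{Y} - \frac{1}{2}m\omega_2^2y^2. \tag{3.29}$$

Since the two variables have now been separated, each side of the equation
must be a constant, which we shall call $E_1$. This gives

$$-\frac{\hbar^2}{2m}X'' + \frac{1}{2}m\omega_1^2x^2X = E_1X, \tag{3.30}$$

and also, setting $E_2 = E - E_1$,

$$-\frac{\hbar^2}{2m}Y'' + \frac{1}{2}m\omega_2^2y^2Y = E_2Y. \tag{3.31}$$

Thus X and Y both satisfy a one-dimensional oscillator equation, and their
respective energy levels are therefore

$$E_j = \left(N_j + \tfrac12\right)\hbar\omega_j, \tag{3.32}$$

where j can be 1 or 2 and both $N_1$ and $N_2$ are non-negative integers. The total
energy is $E = E_1 + E_2$, giving

$$E = \left(N_1 + \tfrac12\right)\hbar\omega_1 + \left(N_2 + \tfrac12\right)\hbar\omega_2. \tag{3.33}$$

Similarly, the wave functions take the form

$$\psi(x,y) = X(x)Y(y) = \left(\frac{m}{\pi\hbar}\right)^{1/2}(\omega_1\omega_2)^{1/4}\,H_{N_1}\left(\sqrt{\frac{m\omega_1}{\hbar}}\,x\right)H_{N_2}\left(\sqrt{\frac{m\omega_2}{\hbar}}\,y\right)\exp\left(-\frac{m}{2\hbar}\left(\omega_1x^2 + \omega_2y^2\right)\right).$$

□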


Although this provides a complete solution, the problem with which 
we started was not really typical because the potential energy contained 
only terms in $x^2$ and $y^2$ but no cross terms xy. In fact, we know that
even in classical mechanics one must change to normal coordinates before 
the differential equations for the motion simplify, and the same is true in 
quantum mechanics. For the sort of problem we have just been considering 
the kinetic energy is already in a reasonable form, but one must diagonalize 
the potential V before attempting to separate the variables. Provided that 
the potential energy is diagonalized by an orthogonal transformation the 
kinetic energy will remain unaffected. 

Example 3.2.1. A particle of mass m moves in a plane under the
influence of a potential

$$V = m\omega^2(x^2 + xy + y^2).$$

Find the energy levels.

Solution. The potential can be written in the matrix form

$$V = \frac{1}{2}m\omega^2\begin{pmatrix}x & y\end{pmatrix}\begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix}.$$

The eigenvalues of the matrix

$$\begin{pmatrix}2 & 1\\ 1 & 2\end{pmatrix}$$

are 3 and 1, so there exist coordinates u and v such that

$$V = \frac{1}{2}m\omega^2\begin{pmatrix}u & v\end{pmatrix}\begin{pmatrix}3 & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix}u\\ v\end{pmatrix} = \frac{1}{2}m\left(3\omega^2u^2 + \omega^2v^2\right).$$

(Since the matrix associated to V is symmetric we know that it can be
diagonalized by an orthogonal change of variables. In fact, by looking at
the eigenvectors of the matrix we see that this could be achieved by taking
$u = (x+y)/\sqrt2$ and $v = (x-y)/\sqrt2$.) The possible energies are therefore

$$\left(N_1 + \tfrac12\right)\hbar\sqrt3\,\omega + \left(N_2 + \tfrac12\right)\hbar\omega,$$

with $N_1$ and $N_2$ non-negative integers. (It is worth noting that $\omega_1 = \sqrt3\,\omega$
and $\omega_2 = \omega$ are precisely the normal frequencies of the classical system,
so, once we know those, we can immediately deduce the quantum energy
levels.) □
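
The diagonalization in Example 3.2.1 can be reproduced numerically. The following Python sketch is illustrative only: it assumes the potential is written as $V = \tfrac12m\omega^2\,(x\ y)\,M\,(x\ y)^T$ with $M$ the symmetric matrix with rows (2, 1) and (1, 2), measures energies in units of $\hbar\omega$, and uses the hypothetical helper names `symmetric_2x2_eigenvalues` and `energy`.

```python
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], larger first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean + radius, mean - radius

# Matrix of the quadratic form for V = m w^2 (x^2 + xy + y^2).
lam1, lam2 = symmetric_2x2_eigenvalues(2.0, 1.0, 2.0)

# Normal frequencies in units of w are the square roots of the eigenvalues.
w1, w2 = math.sqrt(lam1), math.sqrt(lam2)

def energy(n1, n2):
    """Energy of the state (N1, N2) in units of hbar * w."""
    return (n1 + 0.5) * w1 + (n2 + 0.5) * w2

if __name__ == "__main__":
    print(lam1, lam2)
    print(sorted(energy(n1, n2) for n1 in range(3) for n2 in range(3))[:5])
```

The eigenvalues come out as 3 and 1, so the normal frequencies are $\sqrt3\,\omega$ and $\omega$, as in the solution. The same closed form handles any two-dimensional quadratic potential.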


In three dimensions and more one can proceed in a similar way. After 
transferring to normal coordinates the variables are separated (one at a 
time) to reduce the problem to a number of independent one-dimensional 
oscillators. The energy is then the sum of their energies and the wave 
function is the product of their wave functions. (See the exercises at the 
end of this chapter.) 


3.3. Degeneracy 


An interesting special case of the original two-dimensional oscillator arises
when the two frequencies $\omega_1$ and $\omega_2$ coincide. Then the energy levels take
the form

$$E = (N_1 + N_2 + 1)\hbar\omega, \tag{3.34}$$

where $\omega$ denotes the common value of $\omega_1$ and $\omega_2$. Setting $N = N_1 + N_2$,
this may be written as

$$E = (N + 1)\hbar\omega. \tag{3.35}$$

The ground state energy is now $\hbar\omega$ and occurs just when $N_1 = N_2 = 0$.
In general, however, there is more than one wave function giving the same
energy. For example, the first excited state has energy $E = (1+1)\hbar\omega = 2\hbar\omega$,
and this occurs both for $N_1 = 1$, $N_2 = 0$, and for $N_1 = 0$, $N_2 = 1$.
The wave function corresponding to the first of these possibilities is

$$\begin{aligned}\psi_{10}(x,y) &= \left(\frac{m}{\pi\hbar}\right)^{1/2}(\omega_1\omega_2)^{1/4}\,H_1\left(\sqrt{\frac{m\omega_1}{\hbar}}\,x\right)H_0\left(\sqrt{\frac{m\omega_2}{\hbar}}\,y\right)\exp\left(-\frac{m}{2\hbar}\left(\omega_1x^2 + \omega_2y^2\right)\right)\\ &= Ax\exp\left(-\frac{m}{2\hbar}\left(\omega_1x^2 + \omega_2y^2\right)\right),\end{aligned} \tag{3.36}$$

where A is a constant. Similarly, for another constant B, we have

$$\psi_{01}(x,y) = By\exp\left(-\frac{m}{2\hbar}\left(\omega_1x^2 + \omega_2y^2\right)\right), \tag{3.37}$$

which is clearly not a constant multiple of $\psi_{10}$. Since any solution with
energy $2\hbar\omega$ is a linear combination of the two functions $\psi_{10}$ and $\psi_{01}$, the
solution space is two-dimensional.


Definition 3.3.1. If the space of solutions of Schrödinger’s
time-independent equation with energy E has dimension k > 1 then we
say that the energy level is k-fold degenerate; if it is one dimensional
we say that E is a non-degenerate energy level. (One usually says
doubly degenerate rather than two-fold degenerate.)


Example 3.3.1. Our calculation has thus shown that the first excited
state of the two-dimensional oscillator with equal frequencies is doubly
degenerate. If we take the general energy level $E = (N+1)\hbar\omega$ then the
possible choices for $(N_1, N_2)$ are: (N, 0), (N − 1, 1), ..., (0, N). This gives
a total of N + 1 possibilities so that the N-th excited state of energy
$E = (N+1)\hbar\omega$ is (N + 1)-fold degenerate. For the one-dimensional harmonic
oscillator and square well all the energy levels were non-degenerate, because
the wave functions were uniquely determined up to a multiple.
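
The counting in Example 3.3.1 is easy to enumerate. In this small Python sketch the function name `isotropic_2d_states` is a hypothetical one chosen for illustration; it lists the pairs $(N_1, N_2)$ with $N_1 + N_2 = N$ in the order used in the example.

```python
def isotropic_2d_states(N):
    """All pairs (N1, N2) of non-negative integers with N1 + N2 = N,
    i.e. the states of energy (N + 1) hbar w, listed as (N, 0), ..., (0, N)."""
    return [(n1, N - n1) for n1 in range(N, -1, -1)]

if __name__ == "__main__":
    for N in range(4):
        states = isotropic_2d_states(N)
        print(N, len(states), states)
```

The length of the list is N + 1 for every N, which is the stated degeneracy.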


Definition 3.3.2. If all the normal frequencies of a multi-dimensional 


harmonic oscillator coincide then the oscillator is said to be isotropic. 


Remark 3.3.1. Our calculations have shown that all the excited states of
a two-dimensional isotropic oscillator are degenerate, and the same holds
in higher dimensions. However, it is not only isotropic oscillators which
have degenerate energy levels: whenever two of the normal frequencies
are rational multiples of each other some energy levels are degenerate. (See
Exercise 3.2.)


Example 3.3.2. If the normal frequencies of the classical oscillator are
$2\omega$ and $3\omega$ then the possible energies of the quantum oscillator are

$$E = \left[2\left(N_1 + \tfrac12\right) + 3\left(N_2 + \tfrac12\right)\right]\hbar\omega = \left(2N_1 + 3N_2 + \tfrac52\right)\hbar\omega.$$

The energy level $E = 17\hbar\omega/2$ is degenerate because it can be obtained
either with $N_1 = 3$ and $N_2 = 0$ or with $N_1 = 0$ and $N_2 = 2$.
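
A brute-force enumeration shows where the degeneracies of Example 3.3.2 occur. The Python sketch below is an illustration under stated assumptions: energies are in units of $\hbar\omega$, the function name `levels` is hypothetical, and the search is truncated at `max_quanta` quanta per mode, so only the lower part of the resulting table is complete.

```python
from collections import defaultdict
from fractions import Fraction

def levels(freq1, freq2, max_quanta=10):
    """Map each energy (in units of hbar * w) to the list of (N1, N2)
    producing it, for normal frequencies freq1 * w and freq2 * w."""
    table = defaultdict(list)
    half = Fraction(1, 2)
    for n1 in range(max_quanta + 1):
        for n2 in range(max_quanta + 1):
            table[freq1 * (n1 + half) + freq2 * (n2 + half)].append((n1, n2))
    return table

table = levels(2, 3)
# Lowest energy reached by more than one pair (N1, N2).
lowest_degenerate = min(e for e, states in table.items() if len(states) > 1)

if __name__ == "__main__":
    print(lowest_degenerate, table[lowest_degenerate])
```

With frequencies $2\omega$ and $3\omega$ the lowest degenerate level found is $17\hbar\omega/2$, reached by (0, 2) and (3, 0), in agreement with the example.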


3.4. Momentum space 


For some quantum mechanical systems it is useful to use the Fourier
transform, which we shall write in the form

$$(\mathcal{F}\psi)(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}e^{-ipx/\hbar}\psi(x)\,dx. \tag{3.38}$$

The appearance of $\hbar$ in the exponential is to ease the physical interpretation
later on, and the constant factor outside the integral is chosen so that the
inverse transform is just

$$\psi(x) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}e^{ipx/\hbar}(\mathcal{F}\psi)(p)\,dp. \tag{3.39}$$

If we multiply this formula by $\overline{\phi(x)}$, integrate it from $x = -\infty$ to $\infty$, and
interchange the order of integration on the right, then we obtain a useful
result, detailed proofs of which can be found in most analysis books.


Plancherel’s theorem 3.4.1.

$$\int_{-\infty}^{\infty}\overline{\phi(x)}\,\psi(x)\,dx = \int_{-\infty}^{\infty}\overline{(\mathcal{F}\phi)(p)}\,(\mathcal{F}\psi)(p)\,dp.$$

Putting $\phi = \psi$ we see that normalized wave functions have normalized
Fourier transforms.

The utility of the Fourier transform in quantum mechanics, as in classical
differential equation theory, results from its effect on differential operators,
which is summarized in the following result.


Proposition 3.4.2. Let $\mathcal{F}$ denote the Fourier transform; then for all
differentiable $\psi$ in $L^2$, and all p in $\mathbf{R}$,

$$-i\hbar\,(\mathcal{F}\psi')(p) = p\,(\mathcal{F}\psi)(p).$$


Proof. By definition,

$$-i\hbar\,(\mathcal{F}\psi')(p) = -\frac{i\hbar}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}e^{-ipx/\hbar}\psi'(x)\,dx = -\frac{i\hbar}{\sqrt{2\pi\hbar}}\left[e^{-ipx/\hbar}\psi(x)\right]_{-\infty}^{\infty} + p\,(\mathcal{F}\psi)(p).$$

Any normalizable wave function, $\psi$, can be approximated by one which
tends to zero when |x| is large, so that the first term vanishes leaving the
desired answer. □
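
Both Plancherel's theorem and Proposition 3.4.2 can be checked numerically on a simple example. The Python sketch below is an illustration only: it assumes units with $\hbar = 1$, takes a normalized Gaussian as test function, and approximates the transform (3.38) by a midpoint rule on [−10, 10]. The names `psi`, `dpsi`, `fourier` and `norm_squared_of_transform` are hypothetical helpers.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1

def psi(x):
    """Normalized Gaussian test wave function."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0)

def dpsi(x):
    """Derivative of psi."""
    return -x * psi(x)

def fourier(f, p, L=10.0, steps=400):
    """Midpoint-rule approximation to (F f)(p) as defined in (3.38)."""
    h = 2.0 * L / steps
    total = 0j
    for k in range(steps):
        x = -L + (k + 0.5) * h
        total += cmath.exp(-1j * p * x / HBAR) * f(x)
    return total * h / math.sqrt(2.0 * math.pi * HBAR)

def norm_squared_of_transform(L=10.0, steps=400):
    """Midpoint-rule approximation to the integral of |F psi|^2 over p."""
    h = 2.0 * L / steps
    return sum(abs(fourier(psi, -L + (k + 0.5) * h)) ** 2 for k in range(steps)) * h

if __name__ == "__main__":
    print(norm_squared_of_transform())
    p = 0.7
    print(-1j * HBAR * fourier(dpsi, p), p * fourier(psi, p))
```

The first number is 1 to within the accuracy of the quadrature, and the two complex numbers on the last line agree, as the proposition predicts.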


According to the de Broglie relation $-i\hbar\,d/dx$ gives the momentum of a
wave function. On the Fourier transform space this is just multiplication
by p, so we shall often refer to the Fourier-transformed wave functions as
momentum space wave functions. Of course, one cannot expect to have
everything at once, and the Fourier transforms make position calculations
harder. In fact,

$$\begin{aligned}(\mathcal{F}X\psi)(p) &= \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}e^{-ipx/\hbar}\,x\psi(x)\,dx\\ &= i\hbar\frac{\partial}{\partial p}\left(\frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty}e^{-ipx/\hbar}\psi(x)\,dx\right)\\ &= i\hbar\frac{\partial}{\partial p}(\mathcal{F}\psi)(p),\end{aligned}$$

so that now it is positions which involve differentiation.

For calculations involving momentum in three dimensions there is also
an appropriate Fourier transform obtained by transforming each of the
three components:

$$(\mathcal{F}\psi)(\mathbf{p}) = (2\pi\hbar)^{-3/2}\int e^{-i\mathbf{p}\cdot\mathbf{x}/\hbar}\psi(\mathbf{x})\,d^3\mathbf{x}. \tag{3.40}$$


3.5. Motion in a uniform electric field 


The potential V(x) = eFx describes the effect on a charge e of a uniform
electric field F along the x-axis. Schrödinger’s equation is therefore

$$-\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + eFx\psi = E\psi. \tag{3.41}$$

By Fourier transforming this equation (and writing $\tilde\psi = \mathcal{F}\psi$ for brevity),
we obtain

$$\frac{1}{2m}p^2\tilde\psi(p) + i\hbar eF\frac{\partial\tilde\psi}{\partial p} = E\tilde\psi(p). \tag{3.42}$$

On introducing an integrating factor this simplifies to

$$i\hbar eF\frac{\partial}{\partial p}\left(e^{p^3/6im\hbar eF}\,\tilde\psi\right) = E\left(e^{p^3/6im\hbar eF}\,\tilde\psi\right), \tag{3.43}$$

and has the solution

$$e^{p^3/6im\hbar eF}\,\tilde\psi = Ne^{-iEp/eF\hbar}. \tag{3.44}$$

Taking moduli we see that $|\tilde\psi|^2 = |N|^2$ is constant, so that the wave function
is not normalizable. This means that this problem has no bound states.
This is physically reasonable, since the particle can always gain energy from
the field.


Let us now consider how the wave functions evolve. By Fourier
transforming Schrödinger’s equation

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + eFx\psi, \tag{3.45}$$

and introducing the same integrating factor as before, we obtain, for $\tilde\psi = \mathcal{F}\psi$,

$$\frac{\partial}{\partial t}\left(e^{p^3/6im\hbar eF}\,\tilde\psi\right) = eF\frac{\partial}{\partial p}\left(e^{p^3/6im\hbar eF}\,\tilde\psi\right). \tag{3.46}$$

In terms of new variables u = p − eFt and v = p + eFt, this becomes

$$\frac{\partial}{\partial u}\left(e^{p^3/6im\hbar eF}\,\tilde\psi\right) = 0, \tag{3.47}$$

so that the general solution is of the form

$$e^{p^3/6im\hbar eF}\,\tilde\psi(t,p) = \Psi(v) = \Psi(p + eFt), \tag{3.48}$$

where the arbitrary function $\Psi$ can be determined from the initial condition

$$e^{p^3/6im\hbar eF}\,\tilde\psi(0,p) = \Psi(p). \tag{3.49}$$

We therefore have the solution

$$\tilde\psi(t,p) = e^{-p^3/6im\hbar eF}\,e^{(p+eFt)^3/6im\hbar eF}\,\tilde\psi(0,p+eFt), \tag{3.50}$$

which can be rearranged to give

$$\tilde\psi(t,p) = \tilde\psi(0,p+eFt)\exp\left[-\frac{it}{2m\hbar}\left(p^2 + eFtp + \tfrac13e^2F^2t^2\right)\right]. \tag{3.51}$$

The wave function can now be found as a function of x by inverting the
Fourier transform or by using the convolution theorem.
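
As a sanity check on the final formula, one can verify by finite differences that the momentum-space solution satisfies the transformed equation $i\hbar\,\partial_t\tilde\psi = (p^2/2m)\tilde\psi + i\hbar eF\,\partial_p\tilde\psi$. The Python sketch below is illustrative only: the values of m, $\hbar$ and eF and the Gaussian initial profile `psi0` are arbitrary assumptions, and `psi_tilde` and `residual` are hypothetical helper names.

```python
import cmath
import math

M, HBAR, EF = 1.0, 1.0, 0.5  # illustrative values for m, hbar and the product e*F

def psi0(p):
    """Assumed initial momentum-space wave function (a Gaussian)."""
    return math.pi ** -0.25 * math.exp(-p * p / 2.0)

def psi_tilde(t, p):
    """Momentum-space solution of the form (3.51)."""
    phase = -(t / (2.0 * M * HBAR)) * (p * p + EF * t * p + (EF * t) ** 2 / 3.0)
    return psi0(p + EF * t) * cmath.exp(1j * phase)

def residual(t, p, h=1e-5):
    """i hbar d/dt psi - (p^2/2m) psi - i hbar eF d/dp psi, by central differences."""
    d_t = (psi_tilde(t + h, p) - psi_tilde(t - h, p)) / (2.0 * h)
    d_p = (psi_tilde(t, p + h) - psi_tilde(t, p - h)) / (2.0 * h)
    return 1j * HBAR * d_t - (p * p / (2.0 * M)) * psi_tilde(t, p) - 1j * HBAR * EF * d_p

if __name__ == "__main__":
    print(abs(residual(0.7, 0.4)))
```

The residual is zero up to discretization error, and $|\tilde\psi(t,p)| = |\tilde\psi(0, p+eFt)|$, so the momentum distribution is simply translated at the constant rate eF.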


Exercises


3.1 Find normalized wave functions for the first two excited states of the
one-dimensional harmonic oscillator with potential $V = \tfrac12m\omega^2x^2$.

3.2 A particle moves in two dimensions under the influence of the
potential

$$V(x,y) = m\omega^2(10x^2 + 12xy + 10y^2).$$

Find the energy levels and calculate the associated degeneracy of each
level.


3.3 A particle of mass m moves in three dimensions under the influence
of the potential

$$V = \tfrac12m\omega^2(x^2 + y^2 + z^2).$$

Show that the energy levels have the form $(N + \tfrac32)\hbar\omega$ where N is a
non-negative integer, and find their degeneracies.

3.4 Let $D_n^k$ be the number of ways of writing n as the sum of k
non-negative integers $n_1, n_2, \ldots, n_k$. Show that the generating function
can be written as

$$\sum_{n=0}^{\infty}D_n^ks^n = \sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty}\cdots\sum_{n_k=0}^{\infty}s^{n_1+n_2+\cdots+n_k}.$$

Deduce that the generating function is $(1-s)^{-k}$ and hence or otherwise
show that

$$D_n^k = \binom{n+k-1}{n}.$$

Deduce formulae for the degeneracy of isotropic harmonic oscillators
in two and three dimensions.


3.5 A particle of mass m moves on the x-axis under the influence of the
potential

$$V(x) = \tfrac12m\omega^2x^2 + \epsilon x.$$

By changing origin, or otherwise, show that the energy levels are

$$\left(N + \tfrac12\right)\hbar\omega - \frac{\epsilon^2}{2m\omega^2}$$

for N a non-negative integer.


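3.6 Schrödinger’s equation for a two-dimensional isotropic oscillator can
be written in polar coordinates as

$$-\frac{\hbar^2}{2m}\left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\psi}{\partial\theta^2}\right] + \frac{1}{2}m\omega^2r^2\psi = E\psi.$$

By considering solutions $\psi$ that are separable in polar coordinates
verify that the energy levels are of the form $n\hbar\omega$ where n is a positive
integer. Find wave functions $\psi(r,\theta)$ for the two lowest energy levels.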


3.7° At time t = 0 the wave function of a free particle of mass m, moving
along the x-axis, is given by

$$\psi(0,x) = a^{-1/2}e^{-|x|/a}.$$

Derive the wave function at a subsequent time t. Calculate the
probability that the momentum lies in the range $[-\hbar/a, \hbar/a]$.


4 The hydrogen atom 


Herewith it has been demonstrated that the Balmer terms come out
correctly from the new quantum mechanics.


WOLFGANG PAULI, On the hydrogen spectrum, January 1926


4.1. The structure of atoms 


Whilst theoretical physicists struggled to understand the interactions be- 
tween matter and radiation, new experiments were revealing that atoms 
had internal structure. In 1897 J.J. Thomson showed that cathode rays 
consisted of a stream of negatively charged particles much lighter than any 
known atom. These became known as electrons. It was the 1896 discov- 
ery and subsequent investigation of radioactivity which forced the startling 
realization that, far from being indivisible and indestructible as had been 
supposed, atoms can disintegrate or be shattered into smaller pieces. 

Ernest Rutherford realized that by interposing a thin metal foil in the 
path of alpha rays emanating from radium, the subatomic particles emitted 
during radioactive decay could themselves be used to probe the structure of 
matter on an hitherto inaccessibly small scale. The climax of these experi- 
ments with his coworkers, Hans Geiger and Ernest Marsden, came with the 
discovery that atoms consisted mostly of empty space. This led Ruther- 
ford to formulate the popular picture of atoms as miniature solar systems 
in which negatively charged electrons orbit a positively charged nucleus un- 
der the influence of electrostatic rather than gravitational attraction. The 
nucleus was itself composed of positively charged protons and, as it later
transpired, usually some uncharged neutrons, both of these being almost
2000 times as massive as an electron. (Ordinary hydrogen has but a
single proton in its nucleus, otherwise there are usually some neutrons as 
well.) The chemical properties of an atom were determined by its electrons. 
Since atoms are electrically neutral, unless ionized, the number of protons 
matched the number of electrons, but the nucleus might contain more or 
fewer neutrons giving the possibility of chemically identical but physically 
distinct isotopes. 

Although still the popular image of an atom, this picture was soon dis- 
placed by the quantum mechanical picture that we shall describe in this 
chapter. The calculation that more than any other convinced most
physicists of the correctness of quantum theory was that giving the spectrum of
the hydrogen atom. Niels Bohr had already managed to derive the
spectrum using the Einstein-Planck law and Rutherford’s model of the atom,
but his argument relied on a series of brilliant ad hoc assumptions, whose 
application to other problems often led to false conclusions. In 1925, af- 
ter some initial scepticism about Heisenberg’s theory, Pauli showed how it 
gave the spectrum for the hydrogen atom without the need for any extra 
assumptions. It was this more than anything else which convinced most 
physicists of the correctness of quantum theory. The calculation was a 
major tour de force which occupied Pauli for three weeks, but by the end 
of the year Schrödinger’s new method of wave mechanics had reduced the
problem to solving a simple differential equation. 

In the simplest case of the hydrogen atom, and also for heavier atoms 
that have lost all but one electron due to ionization, there is only one 
electron. Since these two cases are mathematically identical we shall inves- 
tigate them both together. As in classical mechanics both the electron and 
the nucleus orbit around their mutual centre of mass. Since the hydrogen 
nucleus is almost 2000 times heavier than the electron whilst for heavier 
ions the discrepancy is even more pronounced, the centre of mass is almost 
at the centre of the nucleus. We shall therefore make the simplifying as- 
sumption that the nucleus of the atom is fixed. (We shall return to discuss 
this in more detail in Section 8.5.) 

If the nucleus carries a positive charge Ze and the electron has a charge
−e then the electrostatic potential energy is $V = -Ze^2/4\pi\epsilon_0r$, where r is
the distance between the electron and nucleus, and $\epsilon_0$ is the dielectric
constant of the vacuum. (The role of $(4\pi\epsilon_0)^{-1}$ in electrostatic theory is much
the same as that of the gravitational constant in Newton’s theory.)
Referred to spherical polar coordinates centred on the nucleus, Schrödinger’s
equation is thus

$$-\frac{\hbar^2}{2m}\nabla^2\psi - \frac{Ze^2}{4\pi\epsilon_0r}\psi = E\psi. \tag{4.1}$$

4.2. Central force problems 


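When dealing with motion under central forces where the potential V =
V(r) depends only on the distance r from the centre, it is natural to
separate variables in spherical polar coordinates $(r, \theta, \phi)$. When the Laplacian
is written in terms of these coordinates the Schrödinger equation becomes

$$-\frac{\hbar^2}{2m}\left[\frac{1}{r}\frac{\partial^2(r\psi)}{\partial r^2} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2}\right] + V(r)\psi = E\psi. \tag{4.2}$$

Multiplying by $r^2/\psi$ and substituting $\psi = R(r)\Theta(\theta)\Phi(\phi)$ we obtain

$$-\frac{\hbar^2}{2m}\left[\frac{r}{R}\frac{d^2(rR)}{dr^2} + \frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\Phi\sin^2\theta}\frac{d^2\Phi}{d\phi^2}\right] + r^2V = r^2E. \tag{4.3}$$

The second and third terms in this equation depend only on the angular
variables, whilst the others depend only on r, so that each group must be
constant. That is, for constant $\lambda$ we have

$$\frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\Phi\sin^2\theta}\frac{d^2\Phi}{d\phi^2} = -\lambda \tag{4.4}$$

and

$$-\frac{\hbar^2}{2m}\left(\frac{r}{R}\frac{d^2(rR)}{dr^2} - \lambda\right) + r^2V = r^2E. \tag{4.5}$$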

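On multiplying the angular equation by $\sin^2\theta$, we obtain

$$\frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2} = -\lambda\sin^2\theta, \tag{4.6}$$

in which the term $\Phi''/\Phi$ depends only on $\phi$ and the remaining terms only
on $\theta$. We must therefore have $\Phi''/\Phi = -\mu^2$ for some constant $\mu$, and

$$\frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \mu^2 = -\lambda\sin^2\theta. \tag{4.7}$$

The $\phi$ equation integrates immediately to give $\Phi$ as a linear combination
of $\exp(\pm i\mu\phi)$ when $\mu \neq 0$. Since $\Phi(\phi + 2\pi) = \Phi(\phi)$ this forces $\mu$ to be a real
integer. When $\mu = 0$ only the constant solution is periodic. We therefore
conclude that we may take $\Phi = \exp(i\mu\phi)$, where $\mu$ is any integer.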

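The $\theta$ equation can be rewritten in terms of $c = \cos\theta$ using

$$\sin\theta\frac{d}{d\theta} = \sin\theta\frac{dc}{d\theta}\frac{d}{dc} = -\sin^2\theta\frac{d}{dc} = (c^2 - 1)\frac{d}{dc}, \qquad (4.8)$$

to obtain Legendre's equation

$$(c^2 - 1)\frac{d}{dc}\left((c^2 - 1)\frac{d\Theta}{dc}\right) - \mu^2\Theta - \lambda(c^2 - 1)\Theta = 0. \qquad (4.9)$$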


We shall not solve this equation in detail here (though it can be solved
by standard methods), since it is more readily handled by the algebraic
techniques that we shall introduce in Chapter 8. The main point is that
the solutions are singular at $c = \pm 1$ unless $\lambda = l(l+1)$ with $l$ an integer
greater than or equal to $|\mu|$. When $\lambda$ takes such a value there is a unique
solution $P_l^\mu$ that is continuous on the interval $[-1,1]$. (When $\mu = 0$ this
solution is a polynomial, $P_l$, called a Legendre polynomial.)
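
For $\mu = 0$ the claim that $\lambda = l(l+1)$ admits a polynomial solution can be checked by machine. The following is a minimal sketch, not part of the original argument: it builds the Legendre polynomials from Bonnet's recursion (a standard identity, assumed here rather than derived) and verifies with exact rational arithmetic that $\frac{d}{dc}\big((c^2-1)P_l'\big) - l(l+1)P_l$ vanishes identically, which is equation (4.9) with $\mu = 0$ after division by $c^2 - 1$. The function names are illustrative.

```python
from fractions import Fraction

def legendre(l):
    """Coefficients (lowest power first) of P_l, built by Bonnet's recursion."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return p_prev
    for n in range(1, l):
        # (n + 1) P_{n+1}(c) = (2n + 1) c P_n(c) - n P_{n-1}(c)
        shifted = [Fraction(0)] + p                      # coefficients of c * P_n
        padded = p_prev + [Fraction(0)] * (len(shifted) - len(p_prev))
        p_prev, p = p, [((2 * n + 1) * s - n * q) / (n + 1)
                        for s, q in zip(shifted, padded)]
    return p

def derivative(poly):
    """Coefficients of the derivative of a polynomial."""
    return [k * a for k, a in enumerate(poly)][1:] or [Fraction(0)]

def times_c2_minus_1(poly):
    """Coefficients of (c^2 - 1) times the given polynomial."""
    out = [Fraction(0)] * (len(poly) + 2)
    for k, a in enumerate(poly):
        out[k + 2] += a
        out[k] -= a
    return out

def legendre_residual(l):
    """Coefficients of d/dc((c^2 - 1) P_l') - l(l + 1) P_l; all zero if P_l solves the equation."""
    p = legendre(l)
    lhs = derivative(times_c2_minus_1(derivative(p)))
    return [a - l * (l + 1) * b for a, b in zip(lhs, p)]

for l in range(6):
    print(l, legendre(l), legendre_residual(l))
```

Exact fractions are used so that a vanishing residual is a genuine identity rather than a rounding accident.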


Definition 4.2.1. The full angular term $\Theta\Phi = P_l^\mu(\theta)e^{i\mu\phi}$ is called a
spherical harmonic of degree $l$ and is usually denoted by $Y_l^\mu(\theta,\phi)$.


Theorem 4.2.1. The space of spherical harmonics of degree $l$ has
dimension $2l + 1$.


Proof. The space is spanned by the $2l + 1$ functions $Y_l^\mu$ for integral
$\mu = -l, -l+1, \ldots, l-1, l$. □


The preceding discussion still leaves the radial equation, which on
multiplication by $R/r$ takes the form

$$-\frac{\hbar^2}{2m}\left(\frac{d^2(rR)}{dr^2} - \frac{l(l+1)}{r^2}(rR)\right) + VrR = ErR. \qquad (4.10)$$


This equation can be investigated by the same technique that we used
for the harmonic oscillator, starting by examining the behaviour at large
distances. For definiteness let us assume that $V(r)$ tends to 0 as $r \to \infty$.
(Other potentials, such as the quadratically increasing three-dimensional
oscillator potential, can be handled similarly, but have different asymptotic
solutions.) In this case the dominant terms in the equation can be written
as

$$\frac{d^2(rR)}{dr^2} \sim -\frac{2mE}{\hbar^2}(rR), \qquad (4.11)$$

whose solutions are $rR \sim \exp(\pm r\sqrt{-2mE}/\hbar)$.

Now, if $E$ were positive the argument of the exponential would be
imaginary and we should have $|rR| \sim 1$. This would mean that $R$ is not
normalizable since

$$\int_{\mathbb{R}^3}|R|^2\,d^3r = \int_{\mathbb{R}^3}|R|^2r^2\sin\theta\,dr\,d\theta\,d\phi = 4\pi\int_0^\infty|rR|^2\,dr. \qquad (4.12)$$

(Near the origin $rR$ must be less singular than $r^{-1/2}$ or the integral will
diverge.) The same argument rules out the case of vanishing $E$. We
conclude that $E$ must be negative. For convenience we write $E = -\hbar^2\kappa^2/2m$,
with $\kappa > 0$, so that

$$rR(r) \sim e^{-\kappa r}. \qquad (4.13)$$


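We shall therefore try to find an exact solution in the form

$$R(r) = \frac{f(r)}{r}e^{-\kappa r}. \qquad (4.14)$$

Substitution into Schrödinger's equation yields

$$-\frac{\hbar^2}{2m}\left(f'' - 2\kappa f' + \kappa^2f - \frac{l(l+1)}{r^2}f\right) + Vf = -\frac{\hbar^2\kappa^2}{2m}f, \qquad (4.15)$$

or, after simplification,

$$f'' - 2\kappa f' - \left(\frac{l(l+1)}{r^2} + \frac{2mV}{\hbar^2}\right)f = 0. \qquad (4.16)$$

For particular potentials, $V$, this equation can be solved in series.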


4.3. The spectrum of the hydrogen atom 


We are now ready to solve Schrödinger's equation for a hydrogen-like atom.
It is useful to introduce the Bohr radius, $a = 4\pi\epsilon_0\hbar^2/me^2$, so that we may
write the term $2mV/\hbar^2$ as $-2Z/ar$.


Proposition 4.3.1. The permissible bound state energies for the
equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi - \frac{Ze^2}{4\pi\epsilon_0 r}\psi = E\psi$$

are given by

$$E_n = -\frac{1}{2n^2}\frac{Z^2e^2}{4\pi\epsilon_0 a}$$

for $n = l+1, l+2, \ldots$. The corresponding wave functions take the
form

$$\psi_{nlm}(\mathbf{r}) = \text{constant}\cdot r^lL_n^l(Zr/a)e^{-Zr/na}Y_l^m(\theta,\phi),$$

where $L_n^l$ is a polynomial of degree $n-l-1$. In particular, the normalized
ground state wave function may be written as

$$\psi_{100}(\mathbf{r}) = (Z^3/\pi a^3)^{1/2}e^{-Zr/a}.$$


Proof. The variables can be separated as in the previous discussion,
giving the radial equation

$$f'' - 2\kappa f' - \left(\frac{l(l+1)}{r^2} - \frac{2Z}{ar}\right)f = 0. \qquad (4.17)$$
In order to ease the calculation we change the variable to $\rho = Zr/a$.
The equation then reduces to

$$\frac{d^2f}{d\rho^2} - \frac{2a\kappa}{Z}\frac{df}{d\rho} - \frac{l(l+1)}{\rho^2}f + \frac{2}{\rho}f = 0. \qquad (4.18)$$

We now try the series solution

$$f(\rho) = \sum_{k=0}^\infty a_k\rho^{k+c} \qquad (4.19)$$

with $a_0 \neq 0$. This gives

$$\sum_{k=0}^\infty(k+c)(k+c-1)a_k\rho^{k+c-2} - 2(a\kappa/Z)\sum_{k=0}^\infty(k+c)a_k\rho^{k+c-1} - l(l+1)\sum_{k=0}^\infty a_k\rho^{k+c-2} + 2\sum_{k=0}^\infty a_k\rho^{k+c-1} = 0, \qquad (4.20)$$

or, collecting terms,

$$\sum_{k=0}^\infty(k+c+l)(k+c-l-1)a_k\rho^{k+c-2} = 2\sum_{k=0}^\infty\left[(k+c)(a\kappa/Z) - 1\right]a_k\rho^{k+c-1}. \qquad (4.21)$$

Equating coefficients of $\rho^{c-2}$ we obtain the indicial equation, $c(c-1) = l(l+1)$.
This time there is no real choice, for, to ensure that $rR(r)$ gives
a normalizable wave function, $|rR|^2$ must have a finite integral near $r = 0$,
which is only possible if $2c > -1$, so we must take $c = l+1$ rather than
$c = -l$. (At this point in Schrödinger's original notebook there is the
exclamation: 'The devil! It is finite at $r = 0$.') The recurrence relation for
the coefficients is therefore

$$(k + 2l + 1)ka_k = 2\left[(a\kappa/Z)(k+l) - 1\right]a_{k-1}, \qquad (4.22)$$

for $k \geq 1$. Arguing as for the harmonic oscillator we discover that, unless
the series terminates, it behaves like $\exp(2\kappa r)$, wiping out the
normalizability that we had tried to achieve with the factor of $\exp(-\kappa r)$. We must
therefore force the series to terminate. Looking at the recurrence relation
for $a_k$ we see that this will happen provided that $a\kappa = Z/n$ for some
$n = (k + l) \geq 1$.


Definition 4.3.1. The positive integer $n$ is called the principal
quantum number.
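
The termination argument can be watched in action by iterating the recurrence relation (4.22) with exact rational arithmetic. The sketch below is only an illustration: the function name, the normalisation $a_0 = 1$ and the cut-off `kmax` are our own choices. It takes the ratio $a\kappa/Z$ and the degree $l$ as inputs; when the ratio equals $1/n$ with $n > l$ exactly $n - l$ coefficients are non-zero, whereas for a ratio that is not of this form the coefficients never vanish.

```python
from fractions import Fraction

def series_coefficients(ratio, l, kmax=12):
    """Coefficients a_0, ..., a_kmax generated by
    (k + 2l + 1) k a_k = 2 [ratio * (k + l) - 1] a_{k-1},  with ratio = a*kappa/Z."""
    coeffs = [Fraction(1)]          # a_0 = 1 is an arbitrary normalisation
    for k in range(1, kmax + 1):
        factor = 2 * (ratio * (k + l) - 1)
        coeffs.append(factor * coeffs[-1] / ((k + 2 * l + 1) * k))
    return coeffs

# a*kappa/Z = 1/2 with l = 0: the series stops after two terms
print(series_coefficients(Fraction(1, 2), 0, kmax=4))
# a*kappa/Z = 2/5 is not of the form 1/n, so no coefficient vanishes
print(series_coefficients(Fraction(2, 5), 0, kmax=4))
```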


By substituting $\kappa = Z/na$, we deduce that the energy is given by

$$E_n = -\frac{\hbar^2\kappa^2}{2m} = -\frac{\hbar^2Z^2}{2mn^2a^2}, \qquad (4.23)$$

which, on using the definition of $a$, reduces to

$$E_n = -\frac{1}{2n^2}\frac{Z^2e^2}{4\pi\epsilon_0 a}. \qquad (4.24)$$

More succinctly, we may write $E_1 = -Z^2e^2/8\pi\epsilon_0 a$ and $E_n = E_1/n^2$.
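
To attach numbers to these formulae one may substitute measured values of the constants. The sketch below assumes the CODATA 2018 values for $\hbar$, the electron mass, the elementary charge and $\epsilon_0$ (these figures are not taken from the text), and evaluates the Bohr radius and the levels $E_n$ for a nucleus of charge $Ze$ that is treated as fixed, as in the derivation above.

```python
import math

# CODATA 2018 values in SI units (an assumption of this sketch)
HBAR = 1.054571817e-34       # J s
M_E = 9.1093837015e-31       # kg
E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F / m

def bohr_radius():
    """a = 4 pi eps0 hbar^2 / (m e^2), in metres."""
    return 4 * math.pi * EPS0 * HBAR**2 / (M_E * E_CHARGE**2)

def energy_level(n, Z=1):
    """E_n = -Z^2 e^2 / (8 pi eps0 a n^2), in joules."""
    return -Z**2 * E_CHARGE**2 / (8 * math.pi * EPS0 * bohr_radius() * n**2)

def to_electron_volts(joules):
    return joules / E_CHARGE

print("Bohr radius (m):", bohr_radius())
for n in (1, 2, 3):
    print(n, to_electron_volts(energy_level(n)), "eV")
```

With these inputs the ground state energy for $Z = 1$ comes out close to $-13.6$ electron volts.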


In the ground state, where $n = 1$, both $l$ and $m$ must vanish, and the
series for $f$ terminates after the first term to give

$$\psi(r) = a_0e^{-\kappa r} = a_0e^{-Zr/a}. \qquad (4.25)$$

In order to normalize the ground state wave function we require that

$$1 = \int_{\mathbb{R}^3}|\psi(r)|^2r^2\sin\theta\,dr\,d\theta\,d\phi = 4\pi\int_0^\infty|r\psi(r)|^2\,dr = 4\pi|a_0|^2\int_0^\infty r^2e^{-2Zr/a}\,dr = \pi|a_0|^2\frac{a^3}{Z^3}.$$

The normalized ground state wave function may therefore be written as

$$\psi_{100}(\mathbf{r}) = (Z^3/\pi a^3)^{1/2}e^{-Zr/a}. \qquad (4.26)$$

In general the wave function $\psi_{nlm}$ with principal quantum number $n$ is the
product of $\exp(-Zr/na)Y_l^m(\theta,\phi)$ with a polynomial $L_n^l(Zr/a)$ known as an
associated Laguerre polynomial. Figure 4.1 shows the graphs of the first
three wave functions. □


Bearing in mind the exponential terms in the wave functions, the Bohr
radius can be taken as an estimate of the size of the atom. When the
experimental values of $\epsilon_0$ and of $e$ are substituted one obtains the value
$a \approx 5 \times 10^{-11}$ metres.

For large values of $n$ the energy $E_n$ converges upwards to 0, the
minimum energy needed to escape from the nucleus altogether. (This is the
energy corresponding to a parabolic orbit in the classical theory.) Strictly 
speaking, in the terminology of Definition 2.5.2 we have found the bound 
state energy levels; once the electron escapes from the nucleus one has to 
work with scattering states. 

The energy difference between the levels with principal numbers $j$ and
$k$ is

$$\left(\frac{1}{j^2} - \frac{1}{k^2}\right)E_1, \qquad (4.27)$$

and this is the energy available to be radiated away as a photon if the wave
function of the electron changes from $\psi_j$ to $\psi_k$. By Planck's law the photon
has frequency

$$\omega_{jk} = \left(\frac{1}{j^2} - \frac{1}{k^2}\right)\frac{E_1}{\hbar}. \qquad (4.28)$$

Conversely a photon of this frequency can change the wave function from
$\psi_k$ to $\psi_j$. In other words the atom transmits and absorbs light only at
certain well-defined frequencies. The series of frequencies corresponding to
$j = 2$ was well known to spectroscopists as the Balmer series. The other
series can also be measured to extremely high accuracy and provide a very
sensitive verification of the predictions of quantum theory.
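
As a rough numerical illustration of (4.28), the sketch below evaluates the wavelengths of the first few lines in which the electron drops to the level with principal number 2, the lines usually called the Balmer series. It assumes $Z = 1$, a fixed nucleus, and the value $E_1 \approx -2.18 \times 10^{-18}$ joules that follows from (4.24) with CODATA constants; the function names are our own, and the wavelength is obtained from the angular frequency as $2\pi c/\omega$.

```python
import math

HBAR = 1.054571817e-34          # J s
C_LIGHT = 299792458.0           # m / s
E1_JOULES = -2.1798723611e-18   # ground state energy for Z = 1 (about -13.6 eV)

def transition_frequency(j, k):
    """Angular frequency (1/j^2 - 1/k^2) E_1 / hbar for a change from level j to level k.

    The result is positive when j > k, that is, when a photon is emitted."""
    return (1 / j**2 - 1 / k**2) * E1_JOULES / HBAR

def wavelength_nm(j, k):
    """Wavelength in nanometres of the photon emitted in the change from level j to level k."""
    return 2 * math.pi * C_LIGHT / transition_frequency(j, k) * 1e9

for upper in (3, 4, 5):
    print(upper, "-> 2:", wavelength_nm(upper, 2), "nm")
```

The first line falls in the red part of the visible spectrum, near 656 nanometres; small corrections such as the motion of the nucleus are ignored here.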

The energy $-E_j$ itself also has a physical interpretation as the minimum
energy that must be supplied to an electron to enable it to escape from the
atom starting from the state labelled by $j$. This is the ionization energy.

Finally we consider the degeneracy of the energy levels. 


FIGURE 4.1. The wave functions and radial probability density as functions of r
for the ground state and first two excited states of a hydrogen-like atom. The scale
on the horizontal axis is the radius of the classical orbit that has the same energy.


Theorem 4.3.2. The energy level $E_n$ for the hydrogen atom has
degeneracy $n^2$.


Proof. We know that the space of spherical harmonics of degree $l$ has
dimension $2l + 1$. Any degree $l$ that is strictly less than $n$ can give the same
energy $E_n$ so that we have a total degeneracy

$$\sum_{l=0}^{n-1}(2l+1) = \sum_{l=0}^{n-1}\left((l+1)^2 - l^2\right) = \sum_{l=1}^{n}l^2 - \sum_{l=0}^{n-1}l^2 = n^2 - 0 = n^2, \qquad (4.29)$$

as claimed. □
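
The counting in this proof is easily confirmed by brute force. The short sketch below (with names of our own choosing) enumerates the pairs $(l, m)$ with $0 \leq l < n$ and $-l \leq m \leq l$ and compares the total with $n^2$.

```python
def degeneracy(n):
    """Number of pairs (l, m) with 0 <= l < n and -l <= m <= l."""
    return sum(1 for l in range(n) for m in range(-l, l + 1))

for n in range(1, 6):
    print(n, degeneracy(n), n**2)
```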


Exercises 


4.1 Find those normalized wave functions for the first and second excited
states of the hydrogen-like atom, for which $l$ vanishes.


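4.2° In a two-dimensional model of the hydrogen atom Schrödinger's
time-independent equation becomes

$$-\frac{\hbar^2}{2m}\left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\psi}{\partial\theta^2}\right] - \frac{e^2}{4\pi\epsilon_0 r}\psi = E\psi.$$

By separating the equation in polar coordinates, show that the energy
levels are of the form $-\kappa/(2N+1)^2$, for $\kappa$ a positive constant, and
$N = 0, 1, \ldots$. Find the degeneracy of each level.


4.3 Schrödinger's equation for a two-dimensional isotropic oscillator can
be written in polar coordinates as

$$-\frac{\hbar^2}{2m}\left[\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\psi}{\partial\theta^2}\right] + \frac{1}{2}m\omega^2r^2\psi = E\psi.$$

By considering solutions $\psi$ that are separable in polar coordinates
verify that the energy levels are of the form $n\hbar\omega$ where $n$ is a positive
integer. Find wave functions $\psi(r,\theta)$ for the two lowest energy levels.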


4.4 Let $Y(\mathbf{r})$ be a harmonic function on $\mathbb{R}^3$ that is homogeneous of
integral degree $l > 0$. Show that if $\psi = \phi(r)Y(\mathbf{r})$ is a solution of
Schrödinger's equation for the hydrogen atom, then $\phi$ satisfies the
equation

$$-\frac{\hbar^2}{2m}\left[\frac{1}{r}\frac{d^2(r\phi)}{dr^2} + \frac{2l}{r}\frac{d\phi}{dr}\right] - \frac{e^2}{4\pi\epsilon_0 r}\phi = E\phi.$$

By considering the asymptotic forms of solutions and making use of
series show that the energy levels for bound states are

$$E = -\left(\frac{e^2}{4\pi\epsilon_0 a}\right)\frac{1}{2(l+k)^2},$$

for $k$ a positive integer. Find a solution of the form $\psi(\mathbf{r}) = z\phi(r)$.
[You may use the identities: $\nabla^2(Y\phi) = (\nabla^2Y)\phi + 2\nabla Y.\nabla\phi + Y(\nabla^2\phi)$
and $\nabla\phi = \phi'(r)\mathbf{r}/r$.]


4.5° A particle of mass $m$ moves in the spherically symmetric potential

$$\frac{\hbar^2}{2m}\left(\frac{\alpha}{r} - \kappa r\right)^2,$$

where $\kappa > 0$. Write down the time-independent Schrödinger equation
and show that if the wave function is written

$$r^{-1}e^{-\kappa r^2/2}f(r)Y_l^m(\theta,\phi)$$

then

$$f'' - 2\kappa rf' + \left(\frac{2mE}{\hbar^2} + 2\kappa\alpha - \kappa - \frac{l(l+1) + \alpha^2}{r^2}\right)f = 0,$$

where $E$ is the energy eigenvalue. Prove that the energy eigenvalues
are given by

$$E = \frac{\hbar^2}{2m}(4n\kappa + \lambda_l),$$

where $n = 0, 1, 2, \ldots$, and find $\lambda_l$.


4.6° A particle of mass $m$ moves in three dimensions under the influence
of the spherically symmetric potential $\frac{1}{2}m\omega^2r^2$. Using the results
of the previous question, show that the energy levels are given by
$E_N = (N + \frac{3}{2})\hbar\omega$, $N = 0, 1, 2, \ldots$, where $N$ is even or odd according
as $l$ is even or odd.


4.7° In terms of the parabolic coordinates

$$u = r(1 - \cos\theta), \qquad v = r(1 + \cos\theta), \qquad w = \phi,$$

Schrödinger's equation for the hydrogen atom can be written as

$$-\frac{\hbar^2}{2m}\left\{\frac{4}{u+v}\left[\frac{\partial}{\partial u}\left(u\frac{\partial\psi}{\partial u}\right) + \frac{\partial}{\partial v}\left(v\frac{\partial\psi}{\partial v}\right)\right] + \frac{1}{uv}\frac{\partial^2\psi}{\partial w^2}\right\} - \frac{Ze^2}{2\pi\epsilon_0(u+v)}\psi = E\psi.$$

By considering separable solutions $\psi = U(u)V(v)W(w)$ show that
the bound states of the hydrogen-like atom have energies

$$E_n = -\frac{1}{2n^2}\frac{Z^2e^2}{4\pi\epsilon_0 a}$$

for positive integers $n$. What is the degeneracy of the energy level
$E_n$?


5 Scattering and tunnelling 


I am so happy to have escaped the terrible mechanics ... which I never
really understood. Now everything is linear, everything can be superposed.


ERWIN SCHRÖDINGER, letter to Willy Wien, 22 February 1926


5.1. Particle beams 


Ever since Rutherford’s pioneering work, passing alpha rays through a 
metal foil target, beams of particles have provided a means of probing 
the fine details of subatomic structure. Modern accelerators hurl particles 
through a target at velocities only slightly less than that of light. The 
wavelength of each particle shrinks as its momentum increases and it be- 
comes possible to distinguish features far beyond the resolving power of any 
microscope. In the simplest cases (such as Rutherford’s sedate Edwardian 
experiments) the particles emerge from their encounter unscathed, and it 
is by analysing the changes to their momentum that one must build up a 
picture of the target. To develop some feeling for what happens we shall 
look at the simplest case in which the particle suffers no deflection. We 
therefore consider particles moving along a line on which there is some kind 
of potential barrier. 

At large distances from the target the potential is approximately
constant: $V_L$ for large negative values of $x$, say, and $V_R$ for large positive
values of $x$. For simplicity we shall assume that for large enough $|x|$ the
potential is actually constant. Then Schrödinger's equation in these two
regions becomes

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_j}{dx^2} + V_j\psi_j = E\psi_j, \qquad (5.1)$$

for $j = L$ or $R$. That is,

$$\frac{d^2\psi_j}{dx^2} = -\frac{2m}{\hbar^2}(E - V_j)\psi_j. \qquad (5.2)$$

This has solutions of the form

$$\psi_j = A_je^{ik_jx} + B_je^{-ik_jx}, \qquad (5.3)$$

where $k_j = \sqrt{2m(E - V_j)/\hbar^2}$ and $A_j$ and $B_j$ are constants of integration.
(Clearly, $k_j$ is real or imaginary according to whether $E > V_j$ or $E < V_j$.
When $E > V_j$ we take $k_j$ to be the positive square root of $2m(E - V_j)/\hbar^2$;
otherwise we set $k_j = i\kappa_j$ with $\kappa_j$ positive.)

According to de Broglie's law the term $\exp(ik_Lx)$ has momentum $\hbar k_L > 0$,
and so represents a wave moving to the right, towards the target.
Similarly, the term $\exp(-ik_Lx)$ represents a wave moving left, away from the
target. To the right of the target these roles are reversed and it is the right
moving wave, $\exp(ik_Rx)$, which is leaving the target, whilst the left
moving wave, $\exp(-ik_Rx)$, approaches it. Our main aim in this chapter will be
to find the coefficients $A_R$ and $B_L$ of the outgoing waves in terms of the
coefficients $A_L$ and $B_R$ of the incident waves. In practice we are very often
interested in the situation in which the incident beam is fired at the target
from the left. Since no beam is incident from the right (and there are no
further targets to cause reflections) the coefficient $B_R$ must vanish, and we
have to find $A_R$ and $B_L$ in terms of $A_L$. (Strictly speaking this
interpretation works only when $E > V_R$ and $k_R$ is real. However, when $E < V_R$ and
$k_R = i\kappa_R$ the wave function takes the form $A_R\exp(-\kappa_Rx) + B_R\exp(\kappa_Rx)$.
This time we have to take $B_R = 0$ to avoid the embarrassment of an
exponentially increasing wave function and probability density to the right of
the barrier.)

Within the target the potential changes very rapidly over distances so 
short that it is often convenient to approximate the change by a jump, or 
discontinuity, in the potential. This raises the question of what happens 
to the wave function at the jump. We shall impose the following matching 
condition: 


Assumption 5.1.1. The wave function, $\psi$, and its derivative, $\psi'$,
are continuous at a potential jump.


Remark 5.1.1. This is a reasonable requirement; elsewhere this continuity 
is an automatic consequence of the fact that the wave function must be 
twice differentiable in order for the Schrödinger equation to make sense.


5.2. Reflection and transmission coefficients 


A useful tool in the discussion of beams is the probability current, which
in one dimension (Exercise 2.8) is given by

$$j(x,t) = \hbar(\overline{\psi}\psi' - \overline{\psi}'\psi)/2mi. \qquad (5.4)$$


Proposition 5.2.1. For time-independent problems in one dimension
the probability current is a constant.


Proof. The continuity of $\psi$ and $\psi'$ shows that the current is well defined
and continuous across a potential jump. Away from discontinuities in the
potential we may use Schrödinger's equation, and its consequence, the
continuity equation, for the current. For time-independent problems, such as a
steady beam, the continuity equation reduces to $j' = 0$, so that the current
is a constant. For continuity across potential jumps that constant must be
the same in all regions. □


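When the wave function takes the special form $A\exp(ikx) + B\exp(-ikx)$
with $k$ real, we may readily compute the probability current to be

$$j = \frac{\hbar}{m}\,\mathrm{Re}\left[\left(\overline{A}e^{-ikx} + \overline{B}e^{ikx}\right)k\left(Ae^{ikx} - Be^{-ikx}\right)\right] = \frac{\hbar k}{m}\,\mathrm{Re}\left[\left(|A|^2 - |B|^2\right) + \left(\overline{B}Ae^{2ikx} - \overline{A}Be^{-2ikx}\right)\right] = \frac{\hbar k}{m}\left(|A|^2 - |B|^2\right), \qquad (5.5)$$

since the second bracket is the difference of a complex number and its
conjugate, and so is purely imaginary.
(This is the one-dimensional version of Exercise 2.9.)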

In practice the potential encountered by a particle in an accelerator as 
it passes through the target is likely to be very complicated and may have 
several discontinuities. The constancy of the current nonetheless enables 
us to keep track of what happens. 


Corollary 5.2.2. Suppose that for large negative $x$ the wave function
takes the form $A_L\exp(ik_Lx) + B_L\exp(-ik_Lx)$ and for large positive
$x$ the form $A_R\exp(ik_Rx) + B_R\exp(-ik_Rx)$, with $k_L$ and $k_R$ positive
real numbers. Then

$$|A_L|^2 + \frac{k_R}{k_L}|B_R|^2 = |B_L|^2 + \frac{k_R}{k_L}|A_R|^2.$$

If the beam is incident from the left, so that $B_R = 0$, then

$$|A_L|^2 = |B_L|^2 + \frac{k_R}{k_L}|A_R|^2.$$


Proof. Since the current is constant its values on the extreme left and
extreme right must agree. For large negative $x$ it takes the value
$\hbar k_L(|A_L|^2 - |B_L|^2)/m$, and for large positive $x$ it takes the value
$\hbar k_R(|A_R|^2 - |B_R|^2)/m$. This gives

$$k_L\left(|A_L|^2 - |B_L|^2\right) = k_R\left(|A_R|^2 - |B_R|^2\right), \qquad (5.6)$$

from which the result follows on rearrangement of the terms. □


This result prompts the following definition: 


Definition 5.2.1. For a beam incident from the left, the reflection
coefficient is defined to be $|B_L/A_L|^2$ and the transmission coefficient
is $(k_R/k_L)|A_R/A_L|^2$.


For beams incident from the left the previous corollary can then be 
restated as follows: 


Corollary 5.2.3. The sum of the reflection and transmission
coefficients is 1.


Remark 5.2.1. This result tells us that all the particles in the beam 
are either reflected or transmitted; none of them gets stuck in the target. 
(Given the enormous cost of producing the beams in modern accelerators, 
it would be annoying to lose anything.) 

In the commonest examples the potential tends to the same value for
large positive and negative values of $x$, so that $k_L = k_R$ and the
transmission coefficient reduces to $|A_R/A_L|^2$.


5.3. Potential jumps 


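As the simplest possible problem involving a discontinuity, suppose that
the potential is given by

$$V(x) = \begin{cases} V_0 & \text{if } x < b, \\ V_1 & \text{if } x > b, \end{cases} \qquad (5.7)$$

where $V_0$ and $V_1$ are constant.

As before, we have for $j = 0, 1$

$$\psi_j = A_je^{ik_jx} + B_je^{-ik_jx}, \qquad (5.8)$$

where $k_j = \sqrt{2m(E - V_j)/\hbar^2}$ and $A_j$ and $B_j$ are constants of integration.
There is a single jump so we just have the two equations

$$\psi_0(b) = \psi_1(b), \qquad \psi_0'(b) = \psi_1'(b). \qquad (5.9)$$

On substituting the expressions for $\psi_0$ and $\psi_1$ these become

$$A_0e^{ik_0b} + B_0e^{-ik_0b} = A_1e^{ik_1b} + B_1e^{-ik_1b},$$
$$ik_0A_0e^{ik_0b} - ik_0B_0e^{-ik_0b} = ik_1A_1e^{ik_1b} - ik_1B_1e^{-ik_1b}. \qquad (5.10)$$

The latter equation may be rewritten as

$$A_0e^{ik_0b} - B_0e^{-ik_0b} = \frac{k_1}{k_0}\left(A_1e^{ik_1b} - B_1e^{-ik_1b}\right). \qquad (5.11)$$

Adding this to the first equation we obtain

$$A_0 = \frac{1}{2k_0}(k_0 + k_1)A_1e^{-i(k_0-k_1)b} + \frac{1}{2k_0}(k_0 - k_1)B_1e^{-i(k_0+k_1)b}. \qquad (5.12)$$

Similarly by subtraction we arrive at

$$B_0 = \frac{1}{2k_0}(k_0 - k_1)A_1e^{i(k_0+k_1)b} + \frac{1}{2k_0}(k_0 + k_1)B_1e^{i(k_0-k_1)b}. \qquad (5.13)$$


Proposition 5.3.1. Let $s_{01} = k_0 + k_1$, $d_{01} = k_0 - k_1$,

$$M_{01}(b) = \frac{1}{2k_0}\begin{pmatrix} s_{01}e^{-id_{01}b} & d_{01}e^{-is_{01}b} \\ d_{01}e^{is_{01}b} & s_{01}e^{id_{01}b} \end{pmatrix}, \qquad C_j = \begin{pmatrix} A_j \\ B_j \end{pmatrix},$$

for $j = 0, 1$. Then the waves on either side of the potential jump are
related by

$$C_0 = M_{01}(b)C_1.$$


Proof. The previous formulae for $A_0$ and $B_0$ may be combined as

$$\begin{pmatrix} A_0 \\ B_0 \end{pmatrix} = \frac{1}{2k_0}\begin{pmatrix} s_{01}e^{-id_{01}b}A_1 + d_{01}e^{-is_{01}b}B_1 \\ d_{01}e^{is_{01}b}A_1 + s_{01}e^{id_{01}b}B_1 \end{pmatrix}, \qquad (5.14)$$

from which the result immediately follows. □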
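
The content of Proposition 5.3.1 can be tested numerically: if the coefficients on the left of the jump are computed from those on the right by means of $M_{01}(b)$, then the two wave functions and their derivatives should agree at $x = b$. The following sketch does this for arbitrarily chosen illustrative values of $k_0$, $k_1$, $b$ and $C_1$; the helper names are our own.

```python
import cmath

def jump_matrix(k0, k1, b):
    """The matrix M_01(b) of Proposition 5.3.1, as a pair of rows."""
    s, d = k0 + k1, k0 - k1
    return ((s * cmath.exp(-1j * d * b) / (2 * k0), d * cmath.exp(-1j * s * b) / (2 * k0)),
            (d * cmath.exp(1j * s * b) / (2 * k0), s * cmath.exp(1j * d * b) / (2 * k0)))

def apply_matrix(matrix, vector):
    """Product of a 2x2 matrix with a column vector (A, B)."""
    (m00, m01), (m10, m11) = matrix
    return (m00 * vector[0] + m01 * vector[1], m10 * vector[0] + m11 * vector[1])

def wave(k, coeffs, x):
    """Value and derivative of A exp(ikx) + B exp(-ikx) at the point x."""
    amp_a, amp_b = coeffs
    plus = amp_a * cmath.exp(1j * k * x)
    minus = amp_b * cmath.exp(-1j * k * x)
    return plus + minus, 1j * k * (plus - minus)

k0, k1, b = 1.3, 0.7, 0.4                 # illustrative values only
c1 = (0.8 + 0.2j, -0.3 + 0.5j)            # arbitrary coefficients to the right of the jump
c0 = apply_matrix(jump_matrix(k0, k1, b), c1)
left, right = wave(k0, c0, b), wave(k1, c1, b)
print(abs(left[0] - right[0]), abs(left[1] - right[1]))   # both differences should be negligible
```

Because only algebra is involved, the same check goes through when $k_1$ is replaced by an imaginary value such as `0.7j`.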




5.4. Multiple potential barriers 


For a single potential jump the matrix notation that we introduced in the 
previous section is unnecessarily sophisticated since the problem is easily 
solved by direct methods. However, as we have already remarked, the 
particles in a scattering experiment are usually subjected to more than one 
jump. A slightly more realistic model of the potential barrier encountered 
in a particle accelerator would have a double potential jump of the form 


$$V(x) = \begin{cases} V_0 & \text{if } x < 0, \\ V_1 & \text{if } 0 < x < a, \\ V_2 & \text{if } x > a, \end{cases} \qquad (5.15)$$

where $V_0$, $V_1$ and $V_2$ are constants (Figure 5.1). (The region where the
potential is $V_1$ corresponds to the target.)

We shall suppose that a beam of particles $A_0\exp(ik_0x)$ is incident on
this potential barrier from the left. Some of the beam will be reflected from
the barrier, so that for negative $x$ the wave function will have the form

$$\psi_0 = A_0e^{ik_0x} + B_0e^{-ik_0x}. \qquad (5.16)$$

In the notation of the last section this is related to the wave function in
the region $[0,a]$ by

$$C_0 = M_{01}(0)C_1. \qquad (5.17)$$

Similarly, extending the notation in the obvious way, we have at $x = a$ the
second identity

$$C_1 = M_{12}(a)C_2, \qquad (5.18)$$

so that overall

$$C_0 = M_{01}(0)M_{12}(a)C_2. \qquad (5.19)$$

The form of the wave function to the right of $x = a$ is $A_2\exp(ik_2x) +
B_2\exp(-ik_2x)$, but since no beam is incident from the right we must have
$B_2 = 0$. That means that

$$C_2 = A_2\begin{pmatrix} 1 \\ 0 \end{pmatrix}. \qquad (5.20)$$

Consequently we have

$$C_0 = A_2M_{01}(0)M_{12}(a)\begin{pmatrix} 1 \\ 0 \end{pmatrix}. \qquad (5.21)$$


FIGURE 5.1. The probability density of a beam incident from the left on a potential
well (top two graphs) or barrier (bottom picture). (The dashed line giving the
energy level also serves as the baseline for the graph of probability density.) To
the left of the jumps there is usually interference between the incident and reflected
beams, though, as the middle picture shows, for certain sizes of well there may be
resonance leading to perfect transmission. To the right there is only a transmitted
beam, whose amplitude is constant. In a well the potential energy is lower, so
classically the particle would speed up, and so it would be less likely to be found
there. For a barrier the reverse would be true.

By multiplying out the various matrices on the right-hand side we can
now find expressions for $A_0$ and $B_0$ in terms of $A_2$. In fact we have

$$\begin{pmatrix} A_0 \\ B_0 \end{pmatrix} = \frac{A_2}{4k_0k_1}\begin{pmatrix} s_{01} & d_{01} \\ d_{01} & s_{01} \end{pmatrix}\begin{pmatrix} s_{12}e^{-id_{12}a} & d_{12}e^{-is_{12}a} \\ d_{12}e^{is_{12}a} & s_{12}e^{id_{12}a} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

$$= \frac{A_2}{4k_0k_1}\begin{pmatrix} s_{01} & d_{01} \\ d_{01} & s_{01} \end{pmatrix}\begin{pmatrix} s_{12}e^{-id_{12}a} \\ d_{12}e^{is_{12}a} \end{pmatrix} = \frac{A_2}{4k_0k_1}\begin{pmatrix} s_{01}s_{12}e^{-id_{12}a} + d_{01}d_{12}e^{is_{12}a} \\ d_{01}s_{12}e^{-id_{12}a} + s_{01}d_{12}e^{is_{12}a} \end{pmatrix}. \qquad (5.22)$$


Proposition 5.4.1. If $E > V_1$ so that $k_1$ is real, then

$$\begin{pmatrix} A_0 \\ B_0 \end{pmatrix} = \frac{A_2e^{ik_2a}}{2k_0k_1}\begin{pmatrix} (k_0 + k_2)k_1\cos k_1a - i(k_2k_0 + k_1^2)\sin k_1a \\ (k_0 - k_2)k_1\cos k_1a - i(k_2k_0 - k_1^2)\sin k_1a \end{pmatrix}.$$

If the energy $E > \max\{V_0, V_1, V_2\}$ so that all three $k_j$ are real, the
reflection coefficient is given by

$$\left|\frac{B_0}{A_0}\right|^2 = \frac{(k_0 - k_2)^2k_1^2\cos^2k_1a + (k_0k_2 - k_1^2)^2\sin^2k_1a}{(k_0 + k_2)^2k_1^2\cos^2k_1a + (k_0k_2 + k_1^2)^2\sin^2k_1a},$$

and the transmission coefficient by

$$\frac{k_2|A_2|^2}{k_0|A_0|^2} = \frac{4k_0k_2k_1^2}{(k_0 + k_2)^2k_1^2\cos^2k_1a + (k_0k_2 + k_1^2)^2\sin^2k_1a}.$$


Proof. Taking the formula for $C_0$ and substituting for each $s$ and $d$ in
terms of the $k_j$ we obtain

$$C_0 = \frac{A_2e^{ik_2a}}{4k_0k_1}\begin{pmatrix} (k_0 + k_1)(k_1 + k_2)e^{-ik_1a} + (k_0 - k_1)(k_1 - k_2)e^{ik_1a} \\ (k_0 - k_1)(k_1 + k_2)e^{-ik_1a} + (k_0 + k_1)(k_1 - k_2)e^{ik_1a} \end{pmatrix}. \qquad (5.23)$$

When $E > V_1$ this can be rewritten in the stated form

$$C_0 = \frac{A_2e^{ik_2a}}{2k_0k_1}\begin{pmatrix} (k_0 + k_2)k_1\cos k_1a - i(k_2k_0 + k_1^2)\sin k_1a \\ (k_0 - k_2)k_1\cos k_1a - i(k_2k_0 - k_1^2)\sin k_1a \end{pmatrix}. \qquad (5.24)$$

Taking the ratio of components on the right-hand side we obtain the
reflection coefficient:

$$\left|\frac{B_0}{A_0}\right|^2 = \left|\frac{(k_0 - k_2)k_1\cos k_1a - i(k_0k_2 - k_1^2)\sin k_1a}{(k_0 + k_2)k_1\cos k_1a - i(k_2k_0 + k_1^2)\sin k_1a}\right|^2. \qquad (5.25)$$

If $E > \max\{V_0, V_1, V_2\}$ this reduces further to the form given. Similarly
the transmission coefficient may be found by comparison of the first
components in the formula for $C_0$. □


Remark 5.4.1. One may now check directly that their sum is 1, in
accordance with Corollary 5.2.3.
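
The check suggested in this remark is also easily done by machine. The sketch below simply transcribes the two formulae of Proposition 5.4.1 for real $k_0$, $k_1$, $k_2$ (the function name and the sample values are our own) and prints the reflection and transmission coefficients together with their sum.

```python
import math

def double_jump_coefficients(k0, k1, k2, a):
    """Reflection and transmission coefficients of Proposition 5.4.1 for real k0, k1, k2."""
    c, s = math.cos(k1 * a), math.sin(k1 * a)
    denom = (k0 + k2)**2 * k1**2 * c**2 + (k0 * k2 + k1**2)**2 * s**2
    reflection = ((k0 - k2)**2 * k1**2 * c**2 + (k0 * k2 - k1**2)**2 * s**2) / denom
    transmission = 4 * k0 * k2 * k1**2 / denom
    return reflection, transmission

for k0, k1, k2, a in [(1.0, 2.0, 1.0, 0.3), (1.0, 0.5, 2.0, 1.7), (3.0, 1.0, 0.2, 5.0)]:
    r, t = double_jump_coefficients(k0, k1, k2, a)
    print(k0, k1, k2, a, r, t, r + t)
```

When $k_0 = k_2$ and $k_1a$ is a multiple of $\pi$ the reflection coefficient vanishes, which is the resonance with perfect transmission mentioned in the caption of Figure 5.1.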


5.5. Tunnelling 


In many situations of practical significance the middle potential $V_1$ is
greater than the energy of the incident beam, so that $k_1 = i\kappa_1$. The
algebraic formula connecting the wave functions on the left- and
right-hand sides of the target is still valid for imaginary $k_1$ and it now gives the
following expression for $C_0$:

$$\frac{A_2e^{ik_2a}}{4k_0\kappa_1}\begin{pmatrix} (k_0 + i\kappa_1)(\kappa_1 - ik_2)e^{\kappa_1a} + (k_0 - i\kappa_1)(\kappa_1 + ik_2)e^{-\kappa_1a} \\ (k_0 - i\kappa_1)(\kappa_1 - ik_2)e^{\kappa_1a} + (k_0 + i\kappa_1)(\kappa_1 + ik_2)e^{-\kappa_1a} \end{pmatrix}. \qquad (5.26)$$

This simplifies to give

$$\frac{A_2e^{ik_2a}}{2k_0\kappa_1}\begin{pmatrix} (k_0 + k_2)\kappa_1\cosh\kappa_1a + i(\kappa_1^2 - k_0k_2)\sinh\kappa_1a \\ (k_0 - k_2)\kappa_1\cosh\kappa_1a - i(\kappa_1^2 + k_0k_2)\sinh\kappa_1a \end{pmatrix}. \qquad (5.27)$$

The reflection and transmission coefficients may be calculated exactly as
before. (For simplicity we consider only the case when $k_0$ and $k_2$ are real.)
The most interesting point is the fact that the transmission coefficient

$$\frac{k_2|A_2|^2}{k_0|A_0|^2} = \frac{4k_0k_2\kappa_1^2}{(k_0 + k_2)^2\kappa_1^2\cosh^2\kappa_1a + (\kappa_1^2 - k_0k_2)^2\sinh^2\kappa_1a} \qquad (5.28)$$

does not vanish.

This example is of far more profound significance than the simplicity
of the above calculation might suggest. It lies behind many of the most
important quantum devices. The crucial point is that a beam of classical
particles of energy $E$ would have insufficient energy to penetrate a barrier
of potential $V_1 > E$ at all; they would all be reflected back like a ball
hitting a high wall. Thanks to their wave properties quantum particles can
'tunnel' through a barrier too large for them to surmount (see Figure 5.2).

FIGURE 5.2. A beam that is incident on a barrier from the left can tunnel through,
even though classically it would lack the energy to traverse the obstacle.
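
To get a feeling for the size of the effect one can evaluate the transmission coefficient (5.28) for a few barrier widths. The sketch below uses illustrative values of $k_0$, $\kappa_1$ and $k_2$ in arbitrary units, chosen by us and not taken from the text; the function name is likewise our own.

```python
import math

def tunnelling_transmission(k0, kappa1, k2, a):
    """Transmission coefficient (5.28) for a barrier in which k_1 = i * kappa1."""
    ch, sh = math.cosh(kappa1 * a), math.sinh(kappa1 * a)
    denom = (k0 + k2)**2 * kappa1**2 * ch**2 + (kappa1**2 - k0 * k2)**2 * sh**2
    return 4 * k0 * k2 * kappa1**2 / denom

widths = [0.5, 1.0, 2.0, 4.0]                    # illustrative barrier widths
values = [tunnelling_transmission(1.0, 1.5, 1.0, a) for a in widths]
for a, t in zip(widths, values):
    print(a, t)
```

The transmission falls off roughly like $\exp(-2\kappa_1a)$ for wide barriers but never reaches zero. This rapid dependence on the width is what makes a tunnelling current so sensitive to the size of a gap.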

Devices that exploit this phenomenon are numerous and include the 
nuclear reactor, the transistor and the superconducting Josephson junction. 
A recent application occurs in the scanning tunnelling electron microscope 
(STEM), for which its inventors, Gerd Binnig and Heinrich Rohrer, received 
the 1986 Nobel Prize in Physics. In this device a fine tungsten needle, its tip 
sharpened to a point only one atom across, is brought within nanometres 
of the sample to be investigated (1 nanometre = $10^{-9}$ m). Electrons can
tunnel across the short gap from the sample to the needle producing a 
detectable current which depends sensitively on the size of the gap. By 
measuring the current and varying the height of the needle as it is drawn 
across the sample in a carefully controlled way a computer relief map of 
the sample can be built up. The device can easily resolve individual atoms 
and promises to be of enormous use in surface chemistry. More recently it 
has been used to manipulate individual atoms too. 


5.6. Potential wells 


This method is not only applicable to problems involving beams. Consider,
for example, a particle in a potential well

$$V(x) = \begin{cases} 0 & \text{if } x \in [0,a], \\ V_0 & \text{if } x \notin [0,a], \end{cases} \qquad (5.29)$$

where $V_0 > E > 0$. This time $k_1$ is real and we may write $k_2 = k_0 = i\kappa$,
with $\kappa$ positive. The wave function to the right of the well is $A_2\exp(-\kappa x) +
B_2\exp(\kappa x)$. Physically one would expect the wave function and probability
to decay exponentially in the region of high potential, so we take $B_2 = 0$.
Similarly on the left, where the wave function is given by $A_0\exp(-\kappa x) +
B_0\exp(\kappa x)$, it is $A_0$ that must vanish to ensure decay for large negative
values of $x$. By Proposition 5.4.1 we have

$$\begin{pmatrix} 0 \\ B_0 \end{pmatrix} = \frac{A_2e^{-\kappa a}}{2k_1\kappa}\begin{pmatrix} 2k_1\kappa\cos k_1a + (\kappa^2 - k_1^2)\sin k_1a \\ (\kappa^2 + k_1^2)\sin k_1a \end{pmatrix}. \qquad (5.30)$$

For consistency we require that

$$2k_1\kappa\cos k_1a + (\kappa^2 - k_1^2)\sin k_1a = 0, \qquad (5.31)$$

that is

$$(k_1^2 - \kappa^2)\tan k_1a = 2k_1\kappa. \qquad (5.32)$$

Now, we also know that

$$-\frac{\hbar^2\kappa^2}{2m} + V_0 = E = \frac{\hbar^2k_1^2}{2m}, \qquad (5.33)$$

so that $\kappa$ and $k_1$ are related by

$$\kappa^2 + k_1^2 = \frac{2mV_0}{\hbar^2}. \qquad (5.34)$$

One can now eliminate $\kappa$ between the equations to obtain a condition on
$k_1$ for solutions to exist. For very large $V_0 \gg E$ we have $\kappa \gg k_1$, so that

$$\tan k_1a = \frac{2k_1\kappa}{k_1^2 - \kappa^2} \qquad (5.35)$$

is very small. In the limit of infinite $\kappa$ this forces $k_1a = n\pi$ for some integer
$n$, and energies

$$E = \frac{\hbar^2n^2\pi^2}{2ma^2}. \qquad (5.36)$$

These are precisely the energies of the particle in a square well potential
that we derived in Section 2.2. This is reasonable, since the enormous
potential jumps at 0 and $a$ should effectively confine the particle within
the interval $[0,a]$.
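
The approach to the infinite well can be followed numerically. The sketch below works in units with $\hbar = m = 1$ (an assumption made purely for convenience), so that (5.34) reads $\kappa^2 + k_1^2 = 2V_0$, and finds by bisection the lowest root of the consistency condition (5.31). The function names and the sample depths are our own; the bracketing interval assumes that $V_0$ exceeds $\pi^2/a^2$.

```python
import math

def well_condition(k1, v0, a):
    """Left-hand side of (5.31), with kappa fixed by kappa^2 + k1^2 = 2 * v0 (hbar = m = 1)."""
    kappa = math.sqrt(2 * v0 - k1**2)
    return 2 * k1 * kappa * math.cos(k1 * a) + (kappa**2 - k1**2) * math.sin(k1 * a)

def lowest_root(v0, a, steps=200):
    """Bisection for the root of (5.31) lying between pi/(2a) and pi/a.

    Assumes v0 > (pi/a)^2, so that the condition changes sign on this interval."""
    lo, hi = 0.5 * math.pi / a, math.pi / a
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if well_condition(lo, v0, a) * well_condition(mid, v0, a) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for v0 in (50.0, 5000.0, 5.0e6):
    print(v0, lowest_root(v0, 1.0), math.pi)
```

As $V_0$ grows the root creeps up towards $\pi/a$ from below, in agreement with the limiting energies (5.36).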


5.7. The scattering matrix 


For a complicated succession of potential steps it is more efficient to adopt
a purely matrix approach rather than following the progress successively
across each barrier. We already know that the coefficients change from left
to right across a series of barriers according to the formula

$$C_L = MC_R, \qquad (5.37)$$

where $M$ denotes a suitable product of matrices of the form $M_{jk}(b)$. In
other words we may capture the overall effect of a succession of potential
jumps, no matter how complicated, in a single matrix equation.


Remark 5.7.1. By direct calculation,

$$\det(M_{01}(b)) = (s_{01}^2 - d_{01}^2)/4k_0^2 = k_1/k_0. \qquad (5.38)$$

Since the outgoing $k$ from one step becomes the input to the next there is
a pairwise cancellation of terms to give $\det(M) = k_R/k_L$, where $k_R$ and $k_L$
are the values of $k$ appropriate to the extreme right and left of the barriers.
If, as is often the case, the potential is the same at the two extremes, then
$\det(M) = 1$.


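Although $M$ is the most useful matrix for one-dimensional problems,
there is another matrix that generalizes more readily to higher dimensions.
The four coefficients $A_L$, $B_L$, $A_R$, $B_R$ appearing in $C_L$ and $C_R$ fall into two
pairs: $A_L$ and $B_R$ are associated with waves incident on the barrier, whilst
$A_R$ and $B_L$ describe waves leaving the barrier.


Definition 5.7.1. The scattering matrix $S$ connects the incoming
and outgoing coefficients by the formula

$$\begin{pmatrix} B_L \\ A_R \end{pmatrix} = S\begin{pmatrix} A_L \\ B_R \end{pmatrix}.$$

By comparison with the earlier equation one can find the entries of $S$ in
terms of those of $M$. Indeed if

$$\begin{pmatrix} A_L \\ B_L \end{pmatrix} = \begin{pmatrix} M_{aa} & M_{ab} \\ M_{ba} & M_{bb} \end{pmatrix}\begin{pmatrix} A_R \\ B_R \end{pmatrix} \qquad (5.39)$$

then

$$\begin{pmatrix} B_L \\ A_R \end{pmatrix} = M_{aa}^{-1}\begin{pmatrix} M_{ba} & \det M \\ 1 & -M_{ab} \end{pmatrix}\begin{pmatrix} A_L \\ B_R \end{pmatrix}. \qquad (5.40)$$


Theorem 5.7.1. If $k_L$ and $k_R$ are real and equal, then the scattering
matrix $S$ is unitary, that is $S^*S = 1$.


Proof. If $k_L$ and $k_R$ are real and equal then Corollary 5.2.2 simplifies to
give

$$|A_L|^2 + |B_R|^2 = |B_L|^2 + |A_R|^2. \qquad (5.41)$$

This can be rewritten as

$$\begin{pmatrix} \overline{A}_L & \overline{B}_R \end{pmatrix}\begin{pmatrix} A_L \\ B_R \end{pmatrix} = \begin{pmatrix} \overline{B}_L & \overline{A}_R \end{pmatrix}\begin{pmatrix} B_L \\ A_R \end{pmatrix} = \begin{pmatrix} \overline{A}_L & \overline{B}_R \end{pmatrix}S^*S\begin{pmatrix} A_L \\ B_R \end{pmatrix}. \qquad (5.42)$$

Since this is true for all choices of $A_L$ and $B_R$ the matrix $S$ must satisfy
$S^*S = 1$, and is therefore unitary. □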
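
The theorem can be illustrated numerically for a simple case. The sketch below builds $M$ for a square well from two jump matrices of the form given in Proposition 5.3.1, converts it to $S$ using the relation between the entries of $S$ and those of $M$, with $(B_L, A_R)$ as the outgoing pair and $(A_L, B_R)$ as the incoming pair, and then forms $S^*S$. The wave numbers and the width are illustrative values, and all helper names are our own.

```python
import cmath

def jump_matrix(k0, k1, b):
    """M_01(b) of Proposition 5.3.1 as a list of rows."""
    s, d = k0 + k1, k0 - k1
    return [[s * cmath.exp(-1j * d * b) / (2 * k0), d * cmath.exp(-1j * s * b) / (2 * k0)],
            [d * cmath.exp(1j * s * b) / (2 * k0), s * cmath.exp(1j * d * b) / (2 * k0)]]

def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][n] * q[n][j] for n in range(2)) for j in range(2)] for i in range(2)]

def scattering_matrix(m):
    """S with (B_L, A_R) = S (A_L, B_R), expressed through the entries of M."""
    (maa, mab), (mba, mbb) = m
    det = maa * mbb - mab * mba
    return [[mba / maa, det / maa], [1 / maa, -mab / maa]]

def adjoint(p):
    """Conjugate transpose of a 2x2 matrix."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]

k_out, k_in, width = 1.0, 1.7, 0.9     # illustrative: the same k on both sides of a well
M = matmul(jump_matrix(k_out, k_in, 0.0), jump_matrix(k_in, k_out, width))
S = scattering_matrix(M)
product = matmul(adjoint(S), S)
print(product)                          # should be close to the identity matrix
```

The determinant of `M` in this example is 1, as Remark 5.7.1 predicts when the potential is the same at the two extremes.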


Remark 5.7.2. When $k_L$ and $k_R$ are not real the matrix $S$ can be far from
unitary. Indeed, if one applies the same mathematical ideas to the case of
the finite potential well described in the previous section then we see from
the fact that both $A_L$ and $B_R$ vanish that the scattering matrix $S$ must
be singular. In fact, considered as a function of the incident momentum
$k_L$, the scattering matrix has poles at the bound state values of $k_L$. This
provides a useful link between the scattering matrix and the bound state
energies.


Exercises 


5.1 Rederive the results of Proposition 5.4.1 by direct solution of the four 
matching conditions at the potential jumps. 


5.2 Calculate the transmission and reflection coefficients for a beam of
particles incident on the following potential barriers from $x = -\infty$:

$$V(x) = \begin{cases} V_0 & \text{if } x \in [0,a], \\ 0 & \text{if } x \notin [0,a]; \end{cases}$$

$$V(x) = \begin{cases} 0 & \text{if } x \in [0,a], \\ V_0 & \text{if } x \notin [0,a]. \end{cases}$$

If $k^2 = 2mE/\hbar^2$ and $k_0^2 = 2m(E - V_0)/\hbar^2$ are fixed calculate the
values of $a$ for which the transmission coefficient has its maximum
and minimum values in each case.


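5.3 Show that the matrix $M_{01}(b)$ of Proposition 5.3.1 can be written in
the form

$$M_{01}(b) = U_0(b)\left(uu^* + \frac{k_1}{k_0}vv^*\right)U_1(b)^*,$$

where

$$u = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad v = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

are an orthonormal basis of $\mathbb{C}^2$, and

$$U_j(b) = \begin{pmatrix} e^{-ik_jb} & 0 \\ 0 & e^{ik_jb} \end{pmatrix}.$$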

5.4° A beam with energy $\hbar^2k^2/2m$ and density $|A|^2$ is incident from large
positive values of x, parallel to the x-axis, on a potential barrier of
the form
$$V(x) = \begin{cases} 0 & \text{if } x > a, \\ -V_0 & \text{if } 0 < x < a, \\ \infty & \text{if } x < 0, \end{cases}$$
where $V_0$ is a positive constant. Show that the wave function for
$x > a$ can be written as
$$\psi(x) = A\left(e^{-ikx} + e^{i(kx+\phi)}\right)$$
and find $\exp(i\phi)$.
5.5 A particle of mass m moves in a potential
$$V(x) = \begin{cases} V_0 & \text{if } x \notin (0,a), \\ 0 & \text{if } x \in (0,a). \end{cases}$$
Writing the bound state energies in the form $E = \hbar^2k^2/2m$, derive a
formula for k directly from Schrödinger's equation and the continuity
conditions. Check that your formula is consistent with equation
(5.35).
5.6 A particle of mass m moves in the three-dimensional energy well
$$V(r) = \begin{cases} 0 & \text{if } 0 < r < a, \\ V_0 & \text{if } r > a. \end{cases}$$
Show that Schrödinger's equation has a continuous solution, $\psi(r)$,
with energy $E = \hbar^2k^2/2m < V_0$ provided that
$$k^2 = \frac{2mV_0}{\hbar^2}\sin^2(ka).$$
Show that for very large $V_0$ this gives energy levels close to those in
Exercise 2.4.


5.7 Find the probability current for the wave function $\psi = A\exp(-\kappa x) +
B\exp(\kappa x)$ when $\kappa$ is real.


5.8° Let $V(x) = 0$, $|x| > R$, and the wave function $\psi(x)$ corresponding
to particles of momentum $\hbar k$ incident on the potential from $-\infty$ be
given by
$$\psi(x) = \begin{cases} e^{ikx} + a e^{-ikx}, & x < -R, \\ b e^{ikx}, & x > R. \end{cases}$$
Show that there is another solution
$$\chi(x) = \begin{cases} b e^{-ikx}, & x < -R, \\ e^{-ikx} - (b\bar{a}/\bar{b})e^{ikx}, & x > R. \end{cases}$$
Deduce that the reflection and transmission coefficients are the same
for particles incident from $-\infty$ as for particles incident from $+\infty$.


5.9 Use the scattering matrix with Br = 0 to show that 
Ar <% AL Moa 
Br = Mop det M }° 


Find the scattering matrix for the square well of Section 5.6. Show 
that as a function of $\kappa$ it has poles where equation (5.31) is satisfied.


6 The mathematical structure of quantum theory


You are only going to spoil Heisenberg's physical ideas with your futile
mathematics.


WOLFGANG PAULI to Max Born, 19 July 1925


6.1. States and observables 


We have already mentioned that Schrödinger's formulation of quantum
mechanics was slightly preceded by a totally different algebraic formulation
proposed by Heisenberg. Although Schrödinger's methods were so much
simpler that they were generally preferred, both approaches seemed to give
the same answers, so it was natural to ask how they were related. This
was particularly important because some features of quantum theory were
easier to understand in Heisenberg's formalism. In fact, as we have already
noted, Schrödinger was able to show that the two methods were essentially
equivalent. Soon mathematicians like Hermann Weyl and John von Neumann
were able to find a general mathematical scheme that encompassed
both approaches, and it is this that we shall now describe.

There are two particularly important concepts in a physical theory: the 
states of the system, which we have so far represented by wave functions, 
and the observables, quantities like position, momentum and energy, that 
we might want to measure. The theory links these in the form of rules that 
tell us how to calculate probabilities of events or to find the expectation 
value of a certain observable when the system is in a given state. 


Definition 6.1.1. The states are described by non-zero vectors in a
complex inner product space $\mathcal{H}$, and two vectors describe the same
state if and only if one is a multiple of the other.


The space $\mathcal{H}$ is chosen to suit the particular problem under
consideration, and it is usually subject to some additional technical restrictions that
we shall discuss in the next section. Although $\mathcal{H}$ can be finite dimensional
as in the matrix treatment of scattering theory, it is more often infinite
dimensional. For example, $\mathcal{H}$ is often the vector space of complex-valued
wave functions $\psi$ for which
$$\int_{\mathbb{R}^3} |\psi(x)|^2\, d^3x < \infty \tag{6.1}$$
(to ensure normalizability), and with the inner product defined by
$$\langle\phi|\psi\rangle = \int_{\mathbb{R}^3} \overline{\phi(x)}\,\psi(x)\, d^3x. \tag{6.2}$$


The fact that we have separated the two vectors by a vertical line rather 
than a comma accords with the notation used in physics and will serve 
as a reminder that we are following a physicists’ convention that makes 
inner products linear in the second variable and conjugate linear in the 
first. (Although different from that current in most algebra texts, this 
convention is gradually gaining ground amongst mathematicians too, and 
it is almost universal in textbooks on quantum theory. Physicists, following 
Dirac, often emphasize the inner product structure by writing the vectors 
inside pointed brackets, for example $|\psi\rangle$, or $\langle\phi|$ for an element of the dual
space, but we shall not usually bother with that.) 


Definition 6.1.2. A vector $\psi$ is said to be normalized if
$$\|\psi\|^2 = \langle\psi|\psi\rangle = 1.$$


In the case of wave functions this accords with the previous definition,
equation (2.17). The fact that vectors that differ only by multiples describe
the same physical state means that the physics is unaffected when we
normalize a vector by multiplying by a suitable constant. Apart from rescaling
states and multiplying them by phase factors such as $\exp(-iHt/\hbar)$, the
vector space structure permits addition of vectors, which makes it possible to
account for superposition and interference effects.

Observables are described by certain kinds of linear transformation. We
should first recall that, in algebra, the adjoint of a linear transformation A
is defined as the unique linear transformation $A^*$ that satisfies
$$\langle A^*\phi|\psi\rangle = \langle\phi|A\psi\rangle, \tag{6.3}$$
for all $\phi$ and $\psi$ in $\mathcal{H}$, and that A is called self-adjoint if $A = A^*$, that is if
$$\langle\phi|A\psi\rangle = \langle A\phi|\psi\rangle \tag{6.4}$$
for all $\phi$ and $\psi$ in $\mathcal{H}$. In infinite dimensions this definition is really rather
restrictive, so we shall refine it in the next section, once we have looked at
its application.
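In finite dimensions the adjoint is just the conjugate transpose, and (6.3) and (6.4) can be checked directly. The sketch below does this for $\mathcal{H} = \mathbb{C}^2$ with the physicists' convention (conjugate linear in the first slot); the particular matrices and vectors are arbitrary choices made for illustration.

```python
def inner(phi, psi):
    """<phi|psi>: conjugate linear in the first slot, linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def adjoint(m):
    """Conjugate transpose, which is the adjoint in finite dimensions."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


A = [[1 + 2j, 3j], [4, 5 - 1j]]       # arbitrary, not self-adjoint
phi = [1j, 2]
psi = [3, 1 - 1j]

lhs = inner(apply(adjoint(A), phi), psi)   # <A* phi | psi>
rhs = inner(phi, apply(A, psi))            # <phi | A psi>

B = [[2, 1 - 1j], [1 + 1j, -3]]       # equals its own conjugate transpose
is_self_adjoint = adjoint(B) == B
b_lhs = inner(phi, apply(B, psi))          # <phi | B psi>
b_rhs = inner(apply(B, phi), psi)          # <B phi | psi>
```

The two sides of (6.3) agree for any A, while the symmetric form (6.4) holds only for matrices such as B.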




Definition 6.1.3. The observables in quantum mechanics are
described by self-adjoint linear transformations on $\mathcal{H}$.


Usually one refers to linear transformations in quantum mechanics as
operators and to these as self-adjoint operators. Typical examples in
Schrödinger's theory are the position and momentum operators:


Definition 6.1.4. The position operator X in three dimensions has
components $X_j$ for $j = 1,2,3$ that are defined on vectors $\psi$ in $\mathcal{H}$ by
$$(X_j\psi)(x) = x_j\psi(x);$$
the momentum operator P has components $P_j$ for $j = 1,2,3$ that are
defined on differentiable $\psi$ by
$$(P_j\psi)(x) = \frac{\hbar}{i}\frac{\partial\psi}{\partial x_j}(x).$$
In one dimension the position and
momentum operators take the simpler form
$$(X\psi)(x) = x\psi(x) \quad\text{and}\quad (P\psi)(x) = \frac{\hbar}{i}\frac{d\psi}{dx}.$$


It should be noted that the momentum operator is defined only on
functions which are differentiable, and there are also subtler restrictions. For
example, P and X can only be applied to those wave functions $\psi$ for which
the integrals in
$$\|P\psi\|^2 = \int \hbar^2\left|\frac{d\psi}{dx}\right|^2 dx \tag{6.5}$$
and
$$\|X\psi\|^2 = \int x^2|\psi(x)|^2\, dx \tag{6.6}$$
are well defined, otherwise their images are not in $\mathcal{H}$. We shall therefore
have to allow for operators, A, which are defined only on a subspace.
To see why these operators are self-adjoint we note that
$$\langle X\phi|\psi\rangle = \int \overline{x\phi(x)}\,\psi(x)\, dx = \int \overline{\phi(x)}\,x\psi(x)\, dx = \langle\phi|X\psi\rangle. \tag{6.7}$$
One can show formally that P is self-adjoint, using integration by parts:
$$\langle P\phi|\psi\rangle = \int \overline{\frac{\hbar}{i}\frac{d\phi}{dx}}\,\psi\, dx = \Big[\,i\hbar\,\overline{\phi}\,\psi\,\Big]_{-\infty}^{\infty} + \int \overline{\phi}\,\frac{\hbar}{i}\frac{d\psi}{dx}\, dx. \tag{6.8}$$
We already require $\phi$ and $\psi$ to be both differentiable (so that P is defined)
and normalizable, so we expect that $(\overline{\phi}\psi)(x)$ should tend to 0 as $|x| \to \infty$.
Then
$$\langle P\phi|\psi\rangle = \int \overline{\phi}\,\frac{\hbar}{i}\frac{d\psi}{dx}\, dx = \langle\phi|P\psi\rangle, \tag{6.9}$$


and P is self-adjoint. (It should be noted that the self-adjointness condition
depends on boundary conditions satisfied by wave functions, and this is
typical.) The simpler alternative way to deal with momentum is to work
with the Fourier transform, and use momentum space wave functions. The
Fourier transform is linear and Plancherel's theorem 3.4.1 tells us that
it does not matter whether we evaluate inner products using the wave
functions $\phi$ and $\psi$ or their Fourier transforms, because
$$\langle\mathcal{F}\phi|\mathcal{F}\psi\rangle = \langle\phi|\psi\rangle. \tag{6.10}$$
Recalling that a linear transformation that preserves inner products is said
to be unitary, we see that the Fourier transform is unitary and respects
all the important quantum mechanical structure of $\mathcal{H}$. Proposition 3.4.2
can be reinterpreted as saying that $(\mathcal{F}P\psi)(p) = p(\mathcal{F}\psi)(p)$, from which the
self-adjointness of P follows by the same argument as for X.
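The formal symmetry of P can be checked numerically for wave functions that decay at infinity, so that the boundary term from the integration by parts is negligible. The sketch below is only an illustration: the two test functions, the units with $\hbar = 1$, and the truncation of the integrals to $[-12, 12]$ are all assumptions made for the example.

```python
import cmath
import math

HBAR = 1.0  # assumed units for this illustration


def phi(x):
    return math.exp(-x * x / 2)


def dphi(x):
    return -x * math.exp(-x * x / 2)


def psi(x):
    return cmath.exp(1j * x) * math.exp(-x * x / 2)


def dpsi(x):
    return (1j - x) * psi(x)


def P(derivative):
    """Momentum operator applied to a function whose derivative is supplied."""
    return lambda x: (HBAR / 1j) * derivative(x)


def inner(f, g, lo=-12.0, hi=12.0, n=4800):
    """Trapezoidal approximation to the integral of conj(f(x)) * g(x)."""
    h = (hi - lo) / n
    total = 0.5 * (complex(f(lo)).conjugate() * g(lo)
                   + complex(f(hi)).conjugate() * g(hi))
    for k in range(1, n):
        x = lo + k * h
        total += complex(f(x)).conjugate() * g(x)
    return total * h


lhs = inner(P(dphi), psi)   # <P phi | psi>
rhs = inner(phi, P(dpsi))   # <phi | P psi>
```

For these rapidly decaying functions `lhs` and `rhs` agree to rounding error. If the test functions did not vanish at the ends of the interval, the boundary term would survive and the two sides would differ, which is the dependence on boundary conditions mentioned above.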


Definition 6.1.5. The Hamiltonian operator, H, is defined on twice
differentiable functions $\psi$ by
$$(H\psi)(x) = -\frac{\hbar^2}{2m}\nabla^2\psi(x) + V(x)\psi(x),$$
or symbolically, by $H = |P|^2/2m + V(X)$.


In one dimension one takes the Hamiltonian operator $H = P^2/2m +
V(X)$. With this definition Schrödinger's time-independent equation can
be written as
$$H\psi = E\psi, \tag{6.11}$$
and interpreted as an eigenvalue problem, with the energies as eigenvalues.
Formally H is self-adjoint for
$$H^* = \frac{1}{2m}(P^*)^2 + V(X^*) = \frac{1}{2m}P^2 + V(X) = H. \tag{6.12}$$
However, more careful consideration of its domain of definition shows that
H is really only self-adjoint for certain potentials V, because H and $H^*$
may not be defined on the same set of vectors. (Fortunately, for all the
potentials that are of interest to us, H really is self-adjoint.)

There is, of course, an important logical distinction between the states 
and observables themselves and the corresponding mathematical descrip- 
tions that we use, but we shall not distinguish them by using different 
notation and terminology, as it should be clear from the context to which 
one is referring. 


6.2. Some mathematical refinements 


In order to be able to handle some of the technical difficulties which have 
already appeared and to work more confidently in infinite dimensions, it is 
customary to embellish $\mathcal{H}$ with some extra structure. However, although
this is important in clarifying some ideas and making possible a mathemat- 
ically rigorous treatment of the subject, it does not substantially alter the 
main physical concepts, and we shall not make much use of it later, so that 
this section may be omitted on a first reading by those who are prepared 
to take such matters on trust. 

The state space, $\mathcal{H}$, is usually subject to two further constraints, which
exploit the fact that the norm enables us to define a distance, $d(\phi, \psi) =
\|\phi - \psi\|$, between vectors. The first of these is that it should be complete,
in the following sense: whenever $\{\psi_n \in \mathcal{H}\}$ is a sequence of vectors, such
that, for any positive $\epsilon$, there exists an integer $N_\epsilon$, with
$$\|\psi_m - \psi_n\| < \epsilon \tag{6.13}$$
for all $m, n > N_\epsilon$, then there exists a limit vector $\psi \in \mathcal{H}$ such that
$$\|\psi_n - \psi\| \to 0. \tag{6.14}$$
This is a physically sensible condition, since it is often useful to take limits,
as, for example, when one wants to sum an infinite Fourier series of the
kind we encountered in Section 2.3. (For more detail see Appendix A1.1.)
Finite-dimensional spaces are automatically complete, and a general
theorem of topology tells us that any metric space can be completed by a
procedure analogous to that for constructing the real numbers from the
rationals. There is therefore no serious loss of generality in assuming $\mathcal{H}$ to
be complete.

The second condition is not strictly necessary, but holds in almost all
the examples which are of interest. We recall that a subset S of $\mathcal{H}$ is dense
if for any vector $\psi \in \mathcal{H}$ and any $\epsilon > 0$ there is a vector $\phi \in S$ such that
$$\|\phi - \psi\| < \epsilon. \tag{6.15}$$
The second requirement is that $\mathcal{H}$ should contain a countable dense subset.
This is again automatic in finite dimensions, because we may choose a basis
and then take for S the vectors which have rational coordinates.


Definition 6.2.1. A Hilbert space is a complex inner product space 
which is complete and contains a countable dense subset. The state 
space is assumed to be a Hilbert space. 


The discussion of the observables is rather more delicate than that for
the states. We have already noted that a linear operator, A, may be defined
only on a subspace, $D(A)$ of $\mathcal{H}$, called the domain of A. We shall assume
that this domain is dense in the sense of the above definition. We must also
now modify the definition of the adjoint. We take for its domain, $D(A^*)$,
the set of vectors $\psi \in \mathcal{H}$ for which there exists a vector
$\Psi \in \mathcal{H}$ with
$$\langle A\phi|\psi\rangle = \langle\phi|\Psi\rangle \tag{6.16}$$
for all $\phi \in D(A)$. In fact, $\Psi$ is uniquely determined, since the difference, $\xi$, of two such vectors
would satisfy $\langle\phi|\xi\rangle = 0$, for all $\phi \in D(A)$, giving
$$\|\xi\|^2 = \langle\xi - \phi|\xi\rangle + \langle\phi|\xi\rangle = \langle\xi - \phi|\xi\rangle. \tag{6.17}$$
The Cauchy-Schwarz-Bunyakowski inequality would then give
$$\|\xi\|^2 \le \|\phi - \xi\|\,\|\xi\|. \tag{6.18}$$
By the density of $D(A)$, we know that for any $\epsilon > 0$ we can find $\phi$ such
that $\|\phi - \xi\| < \epsilon$. So for all positive $\epsilon$, we have $\|\xi\|^2 \le \epsilon\|\xi\|$, which forces
$\xi$ to vanish. Moreover, since $\psi$ and $\Psi$ both occupy the linear slot in the
inner product $\Psi$ depends linearly on $\psi$, and we may write $\Psi = A^*\psi$, with
$A^*$ a uniquely defined linear transformation, which we naturally call the
adjoint.


Definition 6.2.2. If D(A) = D(A*) and on this common domain 
A = A* then A is said to be self-adjoint. This is the sense in which 


observables in quantum mechanics are required to be self-adjoint. 


Often the trickiest part of a problem is to show that a Hamiltonian really 
is self-adjoint. 

This approach to quantum mechanics using Hilbert spaces and self- 
adjoint operators is not the only possibility. One can also sidestep some of 
the problems by noting that most of the operators that one wants to use 
in practice are defined on a common dense domain, S, consisting of the 
space of all infinitely differentiable wave functions, $\psi$, for which $X^jP^k\psi$ is
normalizable for all positive integers j and k. This is actually too small to 
be of much use, but it has a very large dual space, S’, called the tempered 
distributions, to which most of the interesting operators can be transposed. 
However, we shall not avail ourselves of this approach. 

The interrelationship between the mathematics and the physics is de- 
scribed in more detail in von Neumann’s book The mathematical founda- 
tions of quantum mechanics. (Indeed, it was in this book that the term 
‘Hilbert space’ was first introduced and some of the important theorems in 
the area first proved.) 


6.3. Statistical aspects of quantum theory 


There are mathematical advantages in working directly with expectation 
values of observables rather than with the probability densities. 


Definition 6.3.1. The expectation value of an observable A in a state
described by the vector $\psi$ in $D(A)$ is
$$E_\psi(A) = \frac{\langle\psi|A\psi\rangle}{\|\psi\|^2}.$$


Remark 6.3.1. When $\psi$ is normalized this takes the simpler form $E_\psi(A) =
\langle\psi|A\psi\rangle$. When the state is clear from the context one often abbreviates
the notation to $E_\psi(A) = \langle A\rangle$.


Remark 6.3.2. Although it is only for self-adjoint operators A that this
formula has any physical significance it will sometimes be advantageous to
use it for general linear transformations.


To see why we adopt this definition consider the case when A is the
potential energy operator $V(X)$, and $\psi$ has been normalized. Then
$$E_\psi(V(X)) = \langle\psi|V(X)\psi\rangle = \int \overline{\psi(x)}\,V(x)\psi(x)\, dx = \int V(x)|\psi(x)|^2\, dx, \tag{6.19}$$
which is the classical formula for the expectation of $V(x)$ when x is
randomly distributed with probability density $|\psi(x)|^2$. Thus our assumption is
consistent with the previous Definition 2.4.1. Indeed knowing this formula
for all potentials V is sufficient to determine that the probability density
must be $|\psi(x)|^2$, so that, despite the simplicity of the formula, we have not
sacrificed any information by working with expectation values rather than
probability densities.

Statistical calculations involving momentum are often greatly simplified
by Fourier transformation. For example, for normalized $\psi$, Plancherel's theorem gives
$$E_\psi(P) = \langle\psi|P\psi\rangle = \int \overline{(\mathcal{F}\psi)(p)}\,(\mathcal{F}P\psi)(p)\, dp = \int p\,|(\mathcal{F}\psi)(p)|^2\, dp, \tag{6.20}$$
so that $|(\mathcal{F}\psi)(p)|^2$ plays the role of the probability density for momentum.

Before going on to show that the formula for expectations also deter- 
mines the probability distribution for the energy we shall consider some of 
the mathematical properties that make the definition plausible. 


Proposition 6.3.1. For every $\psi$ in $\mathcal{H}$ the expectation value $E_\psi$ has
the following properties:

(i) $E_\psi(1) = 1$, where 1 denotes the identity operator on $\mathcal{H}$.

(ii) $E_\psi(A)$ is real for all self-adjoint operators A.

(iii) $E_\psi(A) \ge 0$ for all positive operators A.

(iv) $E_\psi(A)$ depends linearly on A, that is $E_\psi(\alpha A + \beta B) = \alpha E_\psi(A) +
\beta E_\psi(B)$ for all complex numbers $\alpha$ and $\beta$ and all linear operators A
and B.


Proof. (i) When A is the identity operator
$$E_\psi(1) = \frac{\langle\psi|\psi\rangle}{\|\psi\|^2} = 1. \tag{6.21}$$
(ii) Since A is self-adjoint we have
$$E_\psi(A) = \frac{\langle\psi|A\psi\rangle}{\|\psi\|^2} = \frac{\langle A\psi|\psi\rangle}{\|\psi\|^2} = \frac{\overline{\langle\psi|A\psi\rangle}}{\|\psi\|^2} = \overline{E_\psi(A)}, \tag{6.22}$$
so that $E_\psi(A)$ is real.
(iii) We recall that a linear transformation A is said to be positive if
$$\langle\phi|A\phi\rangle \ge 0 \tag{6.23}$$
for all vectors $\phi$. From this definition it immediately follows that $E_\psi(A)$
is non-negative. In the most important applications A has the form $B^*B$,
and then one can obtain a more precise result:
$$\|\psi\|^2 E_\psi(B^*B) = \langle\psi|B^*B\psi\rangle = \langle B\psi|B\psi\rangle = \|B\psi\|^2. \tag{6.24}$$
(iv) Finally we note that
$$E_\psi(\alpha A + \beta B) = \frac{\langle\psi|(\alpha A + \beta B)\psi\rangle}{\|\psi\|^2} = \frac{\alpha\langle\psi|A\psi\rangle}{\|\psi\|^2} + \frac{\beta\langle\psi|B\psi\rangle}{\|\psi\|^2} = \alpha E_\psi(A) + \beta E_\psi(B), \tag{6.25}$$
completing the proof. □


We shall often write operators, $c1$, that are constant multiples of the
identity just as c. From the expectation values the variance of an observable
A can also be calculated as
$$E_\psi\big((A - E_\psi(A))^2\big) = E_\psi(A^2) - E_\psi(A)^2. \tag{6.26}$$
The left-hand side is clearly non-negative by (iii) above, since $A - E_\psi(A)$ is
self-adjoint and its square therefore has the form $B^*B$. This justifies the
following definition:


Definition 6.3.2. The dispersion of the observable A in the state $\psi$
is given by
$$\Delta_\psi(A) = \left[E_\psi(A^2) - E_\psi(A)^2\right]^{1/2}.$$


This is the quantum theoretical analogue of the standard deviation of a
classical random variable.

It is natural to ask whether a quantum mechanical observable can ever 
have a precise value, that is whether the dispersion can vanish. 


Proposition 6.3.2. The dispersion of A in the state $\psi$ vanishes
if and only if $\psi$ is an eigenvector of A. Moreover, in this case the
associated eigenvalue is $E_\psi(A)$.


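Proof. If $\psi$ is an eigenvector of A, with $A\psi = a\psi$, say, then
$$E_\psi(A) = \frac{\langle\psi|A\psi\rangle}{\|\psi\|^2} = \frac{\langle\psi|a\psi\rangle}{\|\psi\|^2} = a, \tag{6.27}$$
so that
$$A\psi = E_\psi(A)\psi. \tag{6.28}$$
We also have the identity
$$\frac{\|(A - E_\psi(A))\psi\|^2}{\|\psi\|^2} = E_\psi\big((A - E_\psi(A)1)^2\big) = \Delta_\psi(A)^2, \tag{6.29}$$
from which it is clear that $\Delta_\psi(A)$ vanishes if and only if $A\psi = E_\psi(A)\psi$,
that is if and only if $\psi$ is an eigenvector of A with eigenvalue $E_\psi(A)$. The
result now follows immediately. □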
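A small finite-dimensional check may make the proposition concrete. The sketch below computes $E_\psi(A)$ and $\Delta_\psi(A)$ for a self-adjoint $2\times 2$ matrix, once in an eigenvector and once in a vector that is not an eigenvector. The matrix and the vectors are illustrative choices, not taken from the text.

```python
import math


def inner(phi, psi):
    """<phi|psi>, conjugate linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def expectation(A, psi):
    """E_psi(A) = <psi|A psi> / ||psi||^2 (psi need not be normalized)."""
    return inner(psi, apply(A, psi)) / inner(psi, psi)


def dispersion(A, psi):
    """Delta_psi(A) for self-adjoint A, using E(A^2) = ||A psi||^2 / ||psi||^2."""
    a_psi = apply(A, psi)
    e_a2 = inner(a_psi, a_psi) / inner(psi, psi)
    e_a = expectation(A, psi)
    # Clamp tiny negative rounding errors before taking the square root.
    return math.sqrt(max((e_a2 - e_a * e_a).real, 0.0))


A = [[0, 1], [1, 0]]     # self-adjoint, with eigenvalues +1 and -1
eigen = [1, 1]           # A eigen = eigen, deliberately left unnormalized
other = [1, 0]           # not an eigenvector of A

mean_eigen = expectation(A, eigen)
spread_eigen = dispersion(A, eigen)
spread_other = dispersion(A, other)
```

In the eigenvector the dispersion vanishes and the expectation equals the eigenvalue, while in the other state the dispersion is strictly positive.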


Remark 6.3.3. As the eigenvalues are given by expectations, Proposition
6.3.1(ii) tells us that they must be real. We also note, for future use, that
if $\phi$ and $\psi$ are both eigenvectors of A with distinct eigenvalues $\mu$ and $\lambda$,
respectively, then
$$\langle A\phi|\psi\rangle - \langle\phi|A\psi\rangle = (\overline{\mu} - \lambda)\langle\phi|\psi\rangle.$$
The left-hand side vanishes for self-adjoint A, and since $\overline{\mu} = \mu \ne \lambda$, we
must have $\langle\phi|\psi\rangle = 0$, that is $\phi$ is orthogonal to $\psi$.


The eigenvectors of an observable A (also called eigenfunctions or
eigenstates in the Schrödinger formulation) thus play a distinguished role as the
states in which A takes a precise value, namely the eigenvalue. Even when
the wave function $\psi$ is not itself an eigenvector of A, it may be possible to
expand it as a linear combination of orthonormal eigenvectors, $\psi_\lambda$,
$$\psi = \sum_\lambda c_\lambda\psi_\lambda. \tag{6.30}$$
(One can even allow infinite sums, provided that the partial sums satisfy
the convergence condition of (6.13), since then (6.14) gives a limit.) The
coefficients $c_\lambda$ are as usual given by the formula
$$c_\lambda = \langle\psi_\lambda|\psi\rangle, \tag{6.31}$$
which is obtained by taking the inner product of the expansion formula
with the vector $\psi_\lambda$ and using the orthonormality.
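Using the expansion, and supposing that $A\psi_\lambda = a_\lambda\psi_\lambda$, we then have
$$\langle\psi|A\psi\rangle = \sum_\lambda c_\lambda\langle\psi|A\psi_\lambda\rangle = \sum_\lambda c_\lambda a_\lambda\langle\psi|\psi_\lambda\rangle = \sum_\lambda c_\lambda a_\lambda\overline{c_\lambda} = \sum_\lambda a_\lambda|c_\lambda|^2. \tag{6.32}$$
So, if $\psi$ is normalized, we have
$$E_\psi(A) = \sum_\lambda a_\lambda|c_\lambda|^2. \tag{6.33}$$
Similarly for any polynomial f we obtain
$$E_\psi(f(A)) = \sum_\lambda f(a_\lambda)|c_\lambda|^2. \tag{6.34}$$
If $\psi$ is not normalized we can replace it by $\psi/\|\psi\|$ to obtain a similar formula
with $|c_\lambda|^2$ replaced by $|c_\lambda|^2/\|\psi\|^2$. We have now proved the following result: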


Proposition 6.3.3. If it is possible to find an orthonormal basis of
$\mathcal{H}$ consisting of eigenvectors of A then the probability that A takes
the value $a_\lambda$ in the state $\psi$ is $|\langle\psi_\lambda|\psi\rangle|^2/\|\psi\|^2$.


We have already observed that Schrödinger's time-independent
equation can be interpreted as an eigenvalue equation, and the energies as
eigenvalues. The probability of obtaining a given energy E postulated
in Assumption 2.6.1 is thus recovered as a special case of Proposition 6.3.3
when $A = H$.
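The content of Proposition 6.3.3 can be illustrated in two dimensions. The sketch below expands an unnormalized vector in an orthonormal eigenbasis, forms the probabilities $|\langle\psi_\lambda|\psi\rangle|^2/\|\psi\|^2$, and compares the resulting mean with the direct formula for $E_\psi(A)$. The matrix, its eigenbasis and the state are illustrative assumptions.

```python
import math


def inner(phi, psi):
    """<phi|psi>, conjugate linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))


def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


A = [[0, 1], [1, 0]]
s = 1 / math.sqrt(2)
# Orthonormal eigenvectors of A paired with their eigenvalues a_lambda.
eigenpairs = [(1.0, [s, s]), (-1.0, [s, -s])]

psi = [2, 1]                              # deliberately not normalized
norm2 = inner(psi, psi).real              # ||psi||^2

# Probability of each eigenvalue: |<psi_lambda|psi>|^2 / ||psi||^2.
probs = {a: abs(inner(vec, psi)) ** 2 / norm2 for a, vec in eigenpairs}

mean_from_probs = sum(a * p for a, p in probs.items())
mean_direct = inner(psi, apply(A, psi)).real / norm2
total_probability = sum(probs.values())
```

The probabilities add up to 1, and the mean computed from them agrees with $\langle\psi|A\psi\rangle/\|\psi\|^2$, as in (6.33).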

In finite dimensions we know that there is always an orthonormal basis
of eigenvectors for any self-adjoint operator A, so that such an expansion is
always possible. Unfortunately, in an infinite-dimensional space not even
self-adjointness is enough to guarantee that an operator has any
eigenvectors, let alone enough to form a basis. For example, if $\psi$ were an eigenvector
for the momentum operator P in one dimension with eigenvalue p then we
should have
$$P\psi = p\psi. \tag{6.35}$$
Written out explicitly this becomes
$$\frac{\hbar}{i}\frac{d\psi}{dx} = p\psi, \tag{6.36}$$
so that
$$\psi(x) = ce^{ipx/\hbar}, \tag{6.37}$$
for some constant c. We have already seen that this plane wave has infinite
norm, so that $\psi \notin \mathcal{H}$. Worse still, if $\psi$ were an eigenvector of X with
eigenvalue a we should have
$$(X\psi)(x) = a\psi(x), \tag{6.38}$$
or
$$(x - a)\psi(x) = 0. \tag{6.39}$$
This forces $\psi$ to vanish except on the single point $\{a\}$ and so
$$\|\psi\|^2 = \int |\psi(x)|^2\, dx = 0. \tag{6.40}$$
(Intuitively the area under the graph of a function that vanishes at all
but one point is 0; those acquainted with the Lebesgue integral can make
this more precise by observing that $\{a\}$ is a null set.) Since
eigenvectors are, by definition, non-zero vectors, this shows that X like P has no
eigenvectors in $\mathcal{H}$. The fact that neither P nor X can be measured with
arbitrary precision will be strengthened in the next chapter into the famous
'Heisenberg uncertainty principle'. (There is, in fact, a generalization of
the finite-dimensional spectral theorem valid for self-adjoint operators on
Hilbert spaces, but it is an altogether deeper and more subtle result, and
we shall not discuss it further.)

Given a self-adjoint operator, A, in an infinite-dimensional space, it is
useful to define its spectrum to be the set of complex numbers $\alpha$ for which
$A - \alpha$ is not invertible. (The notion of invertible is not quite straightforward
either, but our discussion will not be sensitive to the precise details.) Every
eigenvalue is in the spectrum, because if $(A - \alpha)\psi = 0$, then $A - \alpha$ is not
one-one and so not invertible. In finite dimensions the rank-nullity theorem
enables one to show that the converse is also true but in infinite dimensions
this breaks down. For example, $((X - \alpha)\psi)(x) = (x - \alpha)\psi(x)$ and so any
function in the range of $X - \alpha$ must vanish at $x = \alpha$. This means that
$X - \alpha$ is not onto and therefore not invertible for any real $\alpha$, so the spectrum of
X is the whole of $\mathbb{R}$, whereas we have seen that it has no eigenvalues. The
spectrum contains information about the scattering as well as the bound
states. It can be shown that the spectrum of A is always a closed subset
of $\mathbb{R}$. This can lead to some counterintuitive properties. Consider the two
Hamiltonians $H_j = P^2/2m + \frac{1}{2}m\omega_j^2X^2$ for $j = 1,2$. Their difference $H_1 - H_2$
has eigenvalues given by the energy differences, $(n_1 + \frac{1}{2})\hbar\omega_1 - (n_2 + \frac{1}{2})\hbar\omega_2$.
A theorem of Kronecker says that if $\omega_1/\omega_2$ is irrational this set is dense
in $\mathbb{R}$. The spectrum, which must contain these values and is also closed,
must be the whole of $\mathbb{R}$.


Example 6.3.1. We conclude this section by calculating the mean and
dispersion of the momentum in the n-th eigenstate of the one-dimensional
square well. We recall that
$$\psi_n(x) = \sqrt{2/a}\,\sin(n\pi x/a) \tag{6.41}$$
for $x \in [0,a]$ and vanishes elsewhere. Now for any real normalized wave
function $\psi$ we have
$$E_\psi(P) = \langle\psi|P\psi\rangle = \frac{\hbar}{i}\int_{-\infty}^{\infty}\psi(x)\psi'(x)\, dx = \frac{\hbar}{2i}\Big[\psi(x)^2\Big]_{-\infty}^{\infty} = 0, \tag{6.42}$$
which we should expect anyway by symmetry.
This tells us that the dispersion is given by
$$\Delta_\psi(P)^2 = E_\psi(P^2) - E_\psi(P)^2 = E_\psi(P^2). \tag{6.43}$$
Since the Hamiltonian within the well is given by $H = P^2/2m$ this can also
be expressed as
$$\Delta_\psi(P)^2 = 2mE_\psi(H), \tag{6.44}$$
and, taking $\psi = \psi_n$, which is an eigenfunction of H with value $E_n$, we
obtain
$$\Delta_{\psi_n}(P)^2 = 2mE_{\psi_n}(H) = 2mE_n = \frac{n^2\pi^2\hbar^2}{a^2}. \tag{6.45}$$

6.4. Time evolution in quantum theory 


So far our discussion of the abstract mathematical structure of quantum 
mechanics has not concerned itself with the dynamical questions of how 
the systems change in time. In order to address this deficiency we must 
allow the vector describing the state to depend on time. Let us write $\psi_t$
for the state at time t.


Assumption 6.4.1. The vector $\psi_t$ satisfies the abstract Schrödinger
equation
$$i\hbar\frac{d\psi_t}{dt} = H\psi_t,$$
where H is the relevant Hamiltonian operator.


Remark 6.4.1. When $\psi_t$ is a wave function in one or three dimensions
then this clearly reduces to the differential equation introduced in Section
2.1.


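Theorem 6.4.1. The abstract Schrödinger equation has the formal
solution
$$\psi_t = \exp\left(-\frac{i}{\hbar}\int_0^t H\, dt\right)\psi_0.$$


Proof. Assuming (as is true in this case) that the exponential of an
operator can be defined and has the usual properties, then
$$\frac{d}{dt}\left(\exp\left(\frac{i}{\hbar}\int_0^t H\, dt\right)\psi_t\right) = \frac{d}{dt}\left(\exp\left(\frac{i}{\hbar}\int_0^t H\, dt\right)\right)\psi_t + \exp\left(\frac{i}{\hbar}\int_0^t H\, dt\right)\frac{d\psi_t}{dt} = \exp\left(\frac{i}{\hbar}\int_0^t H\, dt\right)\left[\frac{i}{\hbar}H\psi_t + \frac{d\psi_t}{dt}\right]. \tag{6.46}$$
By Schrödinger's equation the expression in square brackets vanishes and
so $\exp\left(i\int_0^t H\, dt/\hbar\right)\psi_t$ is a constant. Since when $t = 0$ it is $\psi_0$, we obtain
the formal solution given above. □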


In terms of $U_t = \exp\left(-i\int_0^t H\, dt/\hbar\right)$ we may write $\psi_t = U_t\psi_0$, and the
abstract form of Schrödinger's equation can be recovered by differentiation.
When H is constant this simplifies further to give $U_t = \exp(-iHt/\hbar)$. In
this case, since H is self-adjoint we have
$$U_t^* = e^{iH^*t/\hbar} = e^{iHt/\hbar} = U_{-t}. \tag{6.47}$$
As with normal exponentials $U_{-t}$ is the inverse of $U_t$, so that
$$U_t^*U_t = U_{-t}U_t = U_0 = 1 = U_tU_t^*, \tag{6.48}$$
which shows that $U_t$ is unitary. This means that
$$\|\psi_t\|^2 = \|U_t\psi_0\|^2 = \|\psi_0\|^2, \tag{6.49}$$
and confirms in the abstract setting the finding of Proposition 2.5.2, that
the time evolution does not change the normalization of $\psi$. In fact, with the
appropriate technical assumptions, the above arguments can all be made
rigorous.
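For a constant Hamiltonian on a finite-dimensional space the operator $U_t = \exp(-iHt/\hbar)$ can be built from an orthonormal eigenbasis, since it multiplies each eigenvector by the phase $\exp(-ia_\lambda t/\hbar)$. The sketch below does this for a two-state example and checks (6.48) and (6.49). The Hamiltonian $H = \hbar\omega\begin{pmatrix}0&1\\1&0\end{pmatrix}$, the value of `OMEGA` and the initial state are assumptions made for the illustration.

```python
import cmath
import math

HBAR = 1.0    # assumed units
OMEGA = 1.3   # assumed frequency; H = HBAR * OMEGA * [[0, 1], [1, 0]]

s = 1 / math.sqrt(2)
# Orthonormal eigenvectors of H paired with their energies.
eigenpairs = [(HBAR * OMEGA, [s, s]), (-HBAR * OMEGA, [s, -s])]


def inner(phi, psi):
    """<phi|psi>, conjugate linear in the first slot."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))


def evolve(psi0, t):
    """psi_t = sum over lambda of exp(-i a_lambda t / hbar) <psi_lambda|psi_0> psi_lambda."""
    out = [0j, 0j]
    for energy, vec in eigenpairs:
        coeff = cmath.exp(-1j * energy * t / HBAR) * inner(vec, psi0)
        out = [o + coeff * v for o, v in zip(out, vec)]
    return out


psi0 = [1, 0]
t = 0.9
psi_t = evolve(psi0, t)
norm_t = inner(psi_t, psi_t).real       # should equal ||psi_0||^2 = 1
back = evolve(psi_t, -t)                # U_{-t} U_t psi_0 should return psi_0
```

For this Hamiltonian the evolved state is $(\cos\omega t, -i\sin\omega t)$, whose norm is visibly 1 for all t.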


6.5. Measurements in quantum theory 


Measurement occupies a much more important place in quantum mechanics 
than it did in the older theories. In classical physics it was assumed that, 
by taking sufficient care, disturbances to the system during a measurement 
could be kept below any given level of tolerance. In quantum theory that 
is certainly not the case. 

Suppose, for example, that one measures an observable A and finds a
precise value a for it. The system could have started in any state $\psi$ for which the
probability $|\langle\psi|\psi_a\rangle|^2$ for obtaining this value is not zero. However, after
the measurement one knows the exact value so there is zero dispersion,
and, by Proposition 6.3.2, the state must be described by an eigenvector
$\psi_a$ corresponding to a. The effect of the measurement has been to change
the wave function from $\psi$ to $\psi_a$, or some multiple of it.

Of course, the measurement may only determine a range of possibilities
rather than a precise value. In that case there are a number of possible
eigenvectors, and the most that can be asserted after the measurement is
that the vector representing the state should lie in the subspace that they
span. One simple way of ensuring this is to make the following projection
postulate, due to von Neumann and refined by Gerhart Lüders.


Assumption 6.5.1. Let Q be the orthogonal projection onto the
subspace spanned by the possible eigenvectors of A consistent with
the outcome of the measurement. Then after the measurement the
state vector has changed from $\psi$ to $Q\psi$.


Unlike the continuous time evolution described by Schrödinger's
equation, a measurement usually diminishes the norm of the wave function,
since $\|Q\psi\|^2$ defines the probability of the outcome.


Example 6.5.1. Suppose that a is a non-degenerate eigenvalue of A and
that $\psi_a$ is a normalized eigenvector associated with it. The projection, $P_a$,
onto the one-dimensional space that it spans, is then given by
$$P_a\psi = \langle\psi_a|\psi\rangle\,\psi_a \tag{6.50}$$
and this will represent the effect on the state of measuring the value a.
Thus the projection postulate not only asserts that $\psi$ has changed to a
multiple of $\psi_a$, but also tells us what multiple. In fact, that multiple is
determined by
$$\|P_a\psi\|^2 = |\langle\psi_a|\psi\rangle|^2\,\|\psi_a\|^2 = |\langle\psi_a|\psi\rangle|^2, \tag{6.51}$$
so that, if $\psi$ is normalized, then $\|P_a\psi\|^2$ is just the probability of measuring
the value a. This argument can be extended to show that for more general
projections one gets just the probability of obtaining a value in the given
range.


Definition 6.5.1. The transition probability between two normalized
vectors $\phi$ and $\psi$ in $\mathcal{H}$ is
$$\tau(\phi,\psi) = |\langle\phi|\psi\rangle|^2.$$


We can then write the probability of measuring the non-degenerate
eigenvalue a as $\tau(\psi,\psi_a)$. From the earlier discussion this is also the
probability that the state will change from $\psi$ to $\psi_a$ during the measurement,
which is the reason for calling it a transition probability. It can also be used
during ordinary time evolution: for example, the probability that after time
t the system is still in its original state $\psi_0$ is $\tau(\psi_t,\psi_0) = |\langle\psi_t|\psi_0\rangle|^2$. It is
worth noting that $\tau$ is symmetric so that the probability of a change from
$\psi$ to $\phi$ during a measurement is the same as the probability of a transition
from $\phi$ to $\psi$.
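The projection (6.50) and the transition probability can be illustrated in $\mathbb{C}^2$. In the sketch below `psi_a` plays the role of the normalized eigenvector and `psi` of the state before measurement. Both vectors are arbitrary illustrative choices, and no particular observable A is assumed.

```python
import math


def inner(phi, psi):
    """<phi|psi>, conjugate linear in the first slot."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))


def project(psi_a, psi):
    """P_a psi = <psi_a|psi> psi_a, for a normalized vector psi_a."""
    c = inner(psi_a, psi)
    return [c * v for v in psi_a]


def transition_probability(phi, psi):
    """tau(phi, psi) = |<phi|psi>|^2 for normalized phi and psi."""
    return abs(inner(phi, psi)) ** 2


psi_a = [0.6, 0.8j]                              # normalized: 0.36 + 0.64 = 1
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]       # normalized state before measurement

projected = project(psi_a, psi)
prob = inner(projected, projected).real          # ||P_a psi||^2
again = project(psi_a, projected)                # projecting twice changes nothing
```

Here `prob` coincides with `transition_probability(psi, psi_a)`, and swapping the two arguments gives the same number, which is the symmetry of $\tau$ noted above.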


Exercises 


6.1 Show that if A is a linear operator on $\mathcal{H}$ such that $\langle\psi|A\psi\rangle/\|\psi\|^2$ is
real for all states $\psi$ then A must be self-adjoint.
[Hint: Consider vectors $\psi + c\phi$ for $c \in \mathbb{C}$.]


6.2 Let $(r,\theta,\phi)$ be spherical polar coordinates for a particle in $\mathbb{R}^3$. Show
that $-i\hbar\,\partial/\partial r$ is not a self-adjoint operator on the wave functions,
but $-i\hbar(\partial/\partial r + 1/r)$ is self-adjoint.


6.3 Let $\mathcal{P}$ be the parity operator defined on wave functions on $\mathbb{R}$ by
$$(\mathcal{P}\psi)(x) = \psi(-x).$$
Show that $\mathcal{P}^2 = 1$ and deduce that the eigenvalues of $\mathcal{P}$ are $\pm 1$.
Characterize the eigenvectors in terms of even and odd wave functions.


6.4 Show that the operator $\mathcal{P}$ of the previous question satisfies $\mathcal{P}P =
-P\mathcal{P}$, and $\mathcal{P}V(X) = V(-X)\mathcal{P}$, for any function V. Deduce that,
when the potential, V, is an even function $\mathcal{P}$ commutes with the
Hamiltonian, H. Hence or otherwise show that, in this case, it is
sufficient to consider eigenvectors of H that are odd or even functions.
If the system is in a non-degenerate energy state show that the
expectation value of the position operator is 0.


6.5° A quantum mechanical system with only three independent states is
described by $\mathcal{H} = \mathbb{C}^3$. The Hamiltonian operator is
$$H = \hbar\omega\begin{pmatrix} 1 & 2 & 0 \\ 2 & 0 & 2 \\ 0 & 2 & -1 \end{pmatrix}.$$
Show that the eigenvalues of H are $3\hbar\omega$, 0 and $-3\hbar\omega$, and find the
corresponding eigenvectors.
At time $t = 0$ the system is in the state
$$\psi_0 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.$$
Find the Schrödinger state vector $\psi_t$ at a subsequent time t. Let
$p_1$, $p_2$ and $p_3$ denote the probabilities of observing the system in the
states
$$\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},\quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix},\quad \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},$$
respectively. Show that $0 \le p_2 \le \frac{1}{2}$.

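6.6° A two-state quantum system in a magnetic field B has the
Hamiltonian operator
$$H = -\frac{e\hbar}{2\mu}\begin{pmatrix} B_3 & B_1 - iB_2 \\ B_1 + iB_2 & -B_3 \end{pmatrix},$$
where $\mu$ is a constant. At time $t = 0$ the system is in the state
described by the vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Show that the probability of its being
in a state $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ at a time t is
$$\frac{|B|^2 - B_3^2}{|B|^2}\,\sin^2\left(\frac{e|B|t}{2\mu}\right),$$
where $|B|^2 = B_1^2 + B_2^2 + B_3^2$.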


6.7 Show that if $\psi_t$ satisfies Schrödinger's equation and the observable $A$
does not depend on $t$ then

\[ i\hbar\frac{d}{dt}\langle\psi_t|A\psi_t\rangle = \langle\psi_t|(AH - HA)\psi_t\rangle. \]


6.8° A hydrogen-like atom with nuclear charge $Ze$ is initially in its ground
state when radioactive decay suddenly reduces the nuclear charge to
$Z'e$. Show that the probability that a subsequent measurement of
energy will find it in the new ground state is $\bigl(2\sqrt{ZZ'}/(Z + Z')\bigr)^6$.
[The ground state wave function is given in Proposition 4.3.1. The
formula $\int_0^\infty r^n \exp(-\lambda r)\,dr = n!\,\lambda^{-(n+1)}$ may be used without proof.]


7 The commutation relations


In my paper the fact that XY was not equal to YX was very disagreeable
to me. I felt that this was the only point of difficulty with the whole
scheme.


WERNER HEISENBERG, recalling his paper of July 1925 


7.1. The commutation relations 


Our results so far, in particular Section 3.4, strongly suggest that although 
it is possible to find formalisms in which either momentum or position are 
easy to handle, one cannot simultaneously manage both. The underlying 
reason for this is that the operators P and X do not commute. In fact they 
are related by 


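\[
\begin{aligned}
(PX\psi)(x) &= \frac{\hbar}{i}\frac{d}{dx}\bigl(x\psi(x)\bigr) \\
&= \frac{\hbar}{i}\psi(x) + x\,\frac{\hbar}{i}\frac{d\psi}{dx} \\
&= \frac{\hbar}{i}\psi(x) + (XP\psi)(x).
\end{aligned} \tag{7.1}
\]
In other words,
\[ PX = \frac{\hbar}{i}1 + XP. \tag{7.2} \]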


Definition 7.1.1. The commutator, $[A, B]$, of two operators $A$ and
$B$ is defined by

\[ [A, B] = AB - BA. \]


In this notation, we have shown that $[P, X] = -i\hbar 1$. In three dimensions
a similar calculation shows that $[P_j, X_k] = -i\hbar\delta_{jk}1$. On the other hand
multiplications by coordinates commute with each other, and so do partial 
differentiations (at least for the sort of well-behaved wave functions that 
we are using). These results can be summarized as follows:

Theorem 7.1.1. (The canonical commutation relations) In
one dimension the position and momentum operators are related by

\[ [P, X] = \frac{\hbar}{i}1. \]

In higher dimensions the relations are

\[ [P_j, P_k] = 0, \qquad [P_j, X_k] = \frac{\hbar}{i}\delta_{jk}1, \qquad [X_j, X_k] = 0. \]


These commutation relations strongly resemble the classical Poisson 
bracket relations for generalized coordinates and momenta. Recalling that 
the Poisson bracket is defined by 


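\[ \{f, g\} = \sum_j \left( \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial x_j} - \frac{\partial f}{\partial x_j}\frac{\partial g}{\partial p_j} \right), \tag{7.3} \]

we have
\[ \{p_j, x_k\} = \delta_{jk}, \tag{7.4} \]

\[ \{p_j, p_k\} = 0 = \{x_j, x_k\}. \tag{7.5} \]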


The following properties of the commutator also parallel those of the Pois- 
son bracket. 


Proposition 7.1.2. For all operators A, B and C the commutator 
satisfies the following identities: 

(i) $[A, B] = -[B, A]$;

(ii) $[A, B]$ is linear in both $A$ and $B$;

(iii) $[A, BC] = B[A, C] + [A, B]C$;

(iv) (the Jacobi identity) $[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0$.


Proof. (i) $[A, B] = AB - BA = -(BA - AB) = -[B, A]$.
(ii) For any complex numbers $\beta$ and $\gamma$,

\[
\begin{aligned}
[A, \beta B + \gamma C] &= A(\beta B + \gamma C) - (\beta B + \gamma C)A \\
&= \beta(AB - BA) + \gamma(AC - CA) \\
&= \beta[A, B] + \gamma[A, C].
\end{aligned} \tag{7.6}
\]

Linearity in the first variable follows similarly, or by use of (i).
(iii)
\[
\begin{aligned}
[A, BC] &= ABC - BCA \\
&= (AB - BA)C + B(AC - CA) \\
&= B[A, C] + [A, B]C.
\end{aligned} \tag{7.7}
\]

(iv) Finally:

\[
\begin{aligned}
[A, [B, C]] &= [A, BC] - [A, CB] \\
&= B[A, C] + [A, B]C - C[A, B] - [A, C]B \\
&= [B, [A, C]] + [[A, B], C] \\
&= -[B, [C, A]] - [C, [A, B]],
\end{aligned} \tag{7.8}
\]

from which Jacobi's identity follows on rearrangement of the terms. □
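Since the four identities hold for arbitrary operators, they can be spot-checked on small matrices. The sketch below uses exact integer arithmetic; the particular matrices and the helper names are arbitrary choices made for illustration.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, A, b, B):
    """Return the matrix a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return combine(1, matmul(A, B), -1, matmul(B, A))


A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 2]]
C = [[2, 0], [5, -3]]
ZERO = [[0, 0], [0, 0]]

# (i) antisymmetry
antisymmetry = comm(A, B) == combine(-1, comm(B, A), 0, ZERO)
# (ii) linearity in the second argument
linearity = comm(A, combine(2, B, 3, C)) == combine(2, comm(A, B), 3, comm(A, C))
# (iii) [A, BC] = B[A, C] + [A, B]C
product_rule = comm(A, matmul(B, C)) == combine(
    1, matmul(B, comm(A, C)), 1, matmul(comm(A, B), C))
# (iv) Jacobi identity
jacobi = combine(
    1, combine(1, comm(A, comm(B, C)), 1, comm(B, comm(C, A))),
    1, comm(C, comm(A, B))) == ZERO

print(antisymmetry, linearity, product_rule, jacobi)
```

A numerical check on one triple of matrices is of course no substitute for the proof; it only guards against sign slips.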


Remark 7.1.1. The similarity with Poisson brackets led Dirac to suggest 
that each function, f, on the classical phase space should be replaced in 
quantum theory by an operator, Q(f), in such a way that for any pair of 
functions f and g one had 


\[ [Q(f), Q(g)] = -i\hbar\,Q(\{f, g\}). \tag{7.9} \]

This idea has served as the inspiration for most of the mathematical
investigations of quantization. It is, however, now known that this identity
cannot be satisfied for all functions simultaneously without violating some
of the other conditions of quantum theory (see Exercise 7.19). This can
lead to ambiguities when one has to decide whether the quantum analogue
of a classical observable such as $px^2$ should be $PX^2$, $X^2P$, $XPX$ or some
combination of these. Fortunately one can arrange that

\[ [Q(f), Q(g)] + i\hbar\,Q(\{f, g\}) = O(\hbar^2), \tag{7.10} \]

so that for most practical purposes any differences between the answers
obtained are likely to be small. In any case, physicists now have sufficient
experience of quantum mechanics that they no longer have to start by 
considering the corresponding classical system first. 
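To see how small the ordering ambiguity is in the example of $px^2$, the three candidate operators can be compared directly. The short calculation below is a worked illustration using only $PX = XP - i\hbar 1$; it is not taken from the text.

```latex
\begin{aligned}
XPX   &= X(XP - i\hbar 1) = X^2P - i\hbar X, \\
PX^2  &= (XP - i\hbar 1)X = XPX - i\hbar X = X^2P - 2i\hbar X, \\
\tfrac12\left(PX^2 + X^2P\right) &= X^2P - i\hbar X = XPX .
\end{aligned}
```

The three orderings therefore differ pairwise only by multiples of $\hbar X$, and the symmetric combination of $PX^2$ and $X^2P$ coincides with $XPX$.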


7.2. Heisenberg’s uncertainty principle 


The non-commutativity of $P$ and $X$ has some profound consequences that
we shall now start to investigate.

Lemma 7.2.1. Let $A$, $B$ and $C$ be self-adjoint operators such that
$[A, B] = iC$. Then

(i) For all real $t$, $(A - itB)^*(A - itB) = A^2 + tC + t^2B^2$.

(ii) For any normalized vector $\psi \in \mathcal{H}$,

\[ \|(A - itB)\psi\|^2 = E_\psi(A^2) + tE_\psi(C) + t^2E_\psi(B^2). \]

(iii) For any vector $\psi \in \mathcal{H}$, $E_\psi(A^2)E_\psi(B^2) \ge \tfrac14 E_\psi(C)^2$.
(iv) There is equality in (iii) if and only if there exists a real number
$t$ such that $(A - itB)\psi = 0$ or, equivalently, $(A^2 + t^2B^2)\psi = -tC\psi$.


Proof. (i) The adjoint of $(A - itB)$ is $(A + itB)$ so that we have

\[
\begin{aligned}
(A - itB)^*(A - itB) &= (A + itB)(A - itB) \\
&= A^2 - it(AB - BA) + t^2B^2 \\
&= A^2 + tC + t^2B^2.
\end{aligned} \tag{7.11}
\]

(ii) For convenience we shall work with a normalized vector $\psi$. Then by
6.3.1(iv) we see that

\[
\begin{aligned}
E_\psi(A^2) + tE_\psi(C) + t^2E_\psi(B^2) &= E_\psi(A^2 + tC + t^2B^2) \\
&= E_\psi\bigl((A - itB)^*(A - itB)\bigr) \\
&= \|(A - itB)\psi\|^2.
\end{aligned} \tag{7.12}
\]

(iii) From (ii) it is clear that $E_\psi(A^2) + tE_\psi(C) + t^2E_\psi(B^2) \ge 0$.

The quadratic expression $E_\psi(A^2) + tE_\psi(C) + t^2E_\psi(B^2)$ is non-negative,
so its discriminant,

\[ E_\psi(C)^2 - 4E_\psi(A^2)E_\psi(B^2), \tag{7.13} \]

must be non-positive, and this gives the inequality (iii).

(iv) Equality occurs in (iii) when the discriminant vanishes, which is
equivalent to the quadratic having a repeated real root $t$. On the other
hand, since

\[ E_\psi(A^2) + tE_\psi(C) + t^2E_\psi(B^2) = \|(A - itB)\psi\|^2, \tag{7.14} \]

$t$ is a root of the quadratic if and only if $(A - itB)\psi = 0$. By (i) this same
value of $t$ also satisfies

\[ (A^2 + t^2B^2)\psi = -tC\psi. \]

Corollary 7.2.2. (Heisenberg's uncertainty principle) The dispersions
of the position and momentum are related by

\[ \Delta_\psi(P)\Delta_\psi(X) \ge \tfrac12\hbar. \]

The lower bound is achieved if and only if

\[ \psi(x) = \exp\left(-\frac{t}{2\hbar}(x - \mu)^2 + \alpha\right) \]

for some positive real constant $t$ and complex constants $\mu$ and $\alpha$.


Proof. Set $A = P - E_\psi(P)$ and $B = X - E_\psi(X)$. Then, since multiples
of the identity commute with all operators, we have

\[ [A, B] = [P, X] - E_\psi(X)[P, 1] - E_\psi(P)[1, X - E_\psi(X)1] = [P, X] = -i\hbar 1, \tag{7.15} \]

so that $C = -\hbar 1$. Using Definition 6.3.2, the lemma then gives

\[ \Delta_\psi(P)^2\Delta_\psi(X)^2 = E_\psi(A^2)E_\psi(B^2) \ge \tfrac14\hbar^2, \tag{7.16} \]

from which the desired inequality now follows on taking positive square
roots.

For equality, we need a real $t$ such that $A\psi = itB\psi$, or

\[ P\psi = E_\psi(P)\psi + it[X - E_\psi(X)1]\psi. \tag{7.17} \]

For convenience we write $\lambda = E_\psi(P) - itE_\psi(X)$, so that

\[ \frac{\hbar}{i}\frac{d\psi}{dx} = (\lambda + itx)\psi(x). \tag{7.18} \]

Integrating we obtain

\[ \ln(\psi(x)) = \frac{i}{\hbar}\left(\frac{itx^2}{2} + \lambda x\right) + \beta, \tag{7.19} \]

for some constant $\beta$, so that

\[ \psi(x) = \exp\left(-\frac{tx^2}{2\hbar} + \frac{i\lambda x}{\hbar} + \beta\right). \tag{7.20} \]
Finally we note that $\psi$ will not be normalizable unless $t$ is positive.
Setting $\mu = i\lambda/t$ and $\alpha = \beta - \lambda^2/2\hbar t$, we arrive at the stated form of wave
function. □

Remark 7.2.1. Heisenberg's uncertainty principle tells us that, in any
state $\psi$, there is a lower bound to the product of the dispersions of $P$
and $X$, so that greater precision in a measurement of position can only
be bought at the expense of less precision in the measurement of momentum
and vice versa. Soon after he had discovered this result, Heisenberg
suggested the following physical model to enhance its plausibility. If we
wish to determine the position of a particle very accurately using, for
example, some form of microscope then we must work with light (or other
radiation) of a correspondingly short wavelength, since distances cannot be
resolved to better than about half a wavelength: $\Delta_\psi(X) \sim \tfrac12(\text{wavelength})$.
On the other hand photons carry a momentum of order $\hbar/(\text{wavelength})$. Since a
photon must collide with the particle that we are observing if we are to
see it at all, an unknown fraction of this momentum may be transferred
to the particle giving $\Delta_\psi(P) \sim \hbar/(\text{wavelength}) \sim \tfrac12\hbar/\Delta_\psi(X)$. Although
this example provides a rather striking image, it does rather suggest that
the uncertainty in $P$ arises only after our observation of $X$, whereas the
uncertainty principle actually refers to dispersions in the same state at the
same time.


Definition 7.2.1. The states for which $\Delta_\psi(P)\Delta_\psi(X) = \tfrac12\hbar$ are
called minimal uncertainty states.


The ground state of the harmonic oscillator,

\[ \psi(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{-m\omega x^2/2\hbar}, \tag{7.21} \]

is a minimal uncertainty state obtained by taking $t = m\omega$ and $\mu = 0$.
This is no coincidence, because if $\mu = E_\psi(X) + iE_\psi(P)/t$ vanishes the final
equivalence in Lemma 7.2.1 tells us that

\[ (P^2 + m^2\omega^2X^2)\psi = m\omega\hbar\psi, \tag{7.22} \]

or

\[ \left(\frac{P^2}{2m} + \frac12 m\omega^2X^2\right)\psi = \frac12\hbar\omega\psi. \tag{7.23} \]
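The bound $\Delta_\psi(P)\Delta_\psi(X) \ge \tfrac12\hbar$, and the fact that Gaussians attain it, can be illustrated numerically. The sketch below assumes units with $\hbar = 1$ and real wave functions (so that $E_\psi(P) = 0$ and $E_\psi(P^2) = \hbar^2\int \psi'(x)^2\,dx / \int \psi(x)^2\,dx$); the grid size, the finite-difference derivative and the function names are choices made for this example only.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def uncertainty_product(psi, half_width=10.0, steps=4000):
    """Approximate Delta(P)*Delta(X) for a real wave function psi by Riemann sums."""
    h = 2 * half_width / steps
    xs = [-half_width + k * h for k in range(steps + 1)]
    norm = sum(psi(x) ** 2 for x in xs) * h
    mean_x = sum(x * psi(x) ** 2 for x in xs) * h / norm
    var_x = sum((x - mean_x) ** 2 * psi(x) ** 2 for x in xs) * h / norm
    # Central differences for psi'; for real psi, E(P) = 0 so var_p = E(P^2).
    dpsi = [(psi(x + h) - psi(x - h)) / (2 * h) for x in xs]
    var_p = HBAR ** 2 * sum(d * d for d in dpsi) * h / norm
    return math.sqrt(var_x * var_p)


def gaussian(t):
    """The minimal uncertainty form exp(-t x^2 / 2 hbar) with mu = 0."""
    return lambda x: math.exp(-t * x * x / (2 * HBAR))


def first_excited(x):
    """x * exp(-x^2/2): an oscillator eigenfunction that is not a Gaussian."""
    return x * math.exp(-x * x / 2)


print(uncertainty_product(gaussian(3.0)))   # close to hbar/2
print(uncertainty_product(first_excited))   # strictly larger than hbar/2
```

Changing the value of `t` rescales the two dispersions in opposite directions while leaving their product at the minimum.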


7.3. The time-energy uncertainty relation


In the theory of relativity, space and time are linked, and so are momentum
and energy. One would therefore expect a relativistic theory to have
an uncertainty principle for energy and time, as well as for position and
momentum. There are indeed such relations, well known in areas such as
radar signal processing that study time-dependent wave phenomena, but
they have a slightly different aspect in quantum mechanics. In part this
stems from the fact that we cannot simply define a 'time operator' to be
multiplication by $t$. (At any given time this would just be multiplication
by a constant and could be normalized away.) However, the time-energy
uncertainty reveals itself in other ways. For example, the time evolution
of an energy eigenstate is given by multiplication by $\exp(-iEt/\hbar)$. Since
the physical state is unchanged by such multiplications, that state lasts
for ever. This is a special case of a more general relationship between the
lifetime of a state and the precision with which its energy can be known.


Lemma 7.3.1. Let $H$ be the Hamiltonian of a system and $\psi_t$ denote
the state at time $t$, where initially $\psi_0 = \psi$, and assume that $\langle\psi|\psi_t\rangle$ is
twice continuously differentiable in $t$ near $t = 0$. Then for small $t$ we
have

\[ \langle\psi|\psi_t\rangle = \exp\left(-\frac{it}{\hbar}E_\psi(H) - \frac{t^2}{2\hbar^2}\Delta_\psi(H)^2\right)\left[1 + o(t^2)\right], \]

where $o(t^2)$ denotes a term that tends to 0 faster than $t^2$ as $t \to 0$.


Proof. We may as well assume that $\psi$ is normalized, and we shall drop
the suffix $\psi$ from $E$ and $\Delta$. We first note that

\[
\begin{aligned}
i\hbar\frac{d}{dt}\Bigl[\exp(itE(H)/\hbar)\langle\psi|\psi_t\rangle\Bigr]
&= \exp(itE(H)/\hbar)\bigl[\langle\psi|H\psi_t\rangle - E(H)\langle\psi|\psi_t\rangle\bigr] \\
&= \exp(itE(H)/\hbar)\langle\psi|[H - E(H)]\psi_t\rangle.
\end{aligned} \tag{7.24}
\]

This vanishes when $t = 0$, since $E(H) = \langle\psi|H\psi\rangle$. Differentiating again, we
obtain

\[ -\hbar^2\frac{d^2}{dt^2}\Bigl[\exp(itE(H)/\hbar)\langle\psi|\psi_t\rangle\Bigr] = \exp(itE(H)/\hbar)\langle\psi|[H - E(H)]^2\psi_t\rangle, \tag{7.25} \]

which reduces to

\[ \langle\psi|[H - E(H)]^2\psi\rangle = \Delta(H)^2 \tag{7.26} \]

at $t = 0$. From these two results the product rule can be applied to show
that the first and second derivatives of

\[ \exp\left(\frac{it}{\hbar}E(H) + \frac{t^2}{2\hbar^2}\Delta(H)^2\right)\langle\psi|\psi_t\rangle \tag{7.27} \]

both vanish at $t = 0$. Since this function takes the value 1 at $t = 0$, Taylor's
theorem gives the identity

\[ \exp\left(\frac{it}{\hbar}E(H) + \frac{t^2}{2\hbar^2}\Delta(H)^2\right)\langle\psi|\psi_t\rangle = 1 + o(t^2), \tag{7.28} \]

from which the result follows. □


Corollary 7.3.2. The probability that the state $\psi$ will not change
within a short time $t$ under the evolution given by $H$ is

\[ \exp\left(-\left(t\Delta_\psi(H)/\hbar\right)^2\right)\left[1 + o(t^2)\right]. \]

Proof. Using the lemma we see that the probability that there will be
no change of state is

\[ |\langle\psi|\psi_t\rangle|^2 = \exp\left(-\frac{t^2}{\hbar^2}\Delta_\psi(H)^2\right)\left(1 + o(t^2)\right). \qquad \Box \]

The exponential term remains bigger than $\tfrac12$ unless

\[ t\Delta_\psi(H)/\hbar > \sqrt{\ln 2}, \]

and this shows that the timescale on which the state changes is inversely
proportional to the dispersion of the energy.
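A system with finitely many energy levels gives a simple check of the short-time formula $\exp(-(t\Delta_\psi(H)/\hbar)^2)$. In the sketch below the state is specified by the probabilities $|c_k|^2$ of its energy components, so that $\langle\psi|\psi_t\rangle = \sum_k |c_k|^2 e^{-iE_kt/\hbar}$; the particular energies, the units with $\hbar = 1$ and the function names are assumptions for illustration.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def survival_exact(energies, weights, t):
    """|<psi|psi_t>|^2 for a state with weight |c_k|^2 on the level E_k."""
    amplitude = sum(w * cmath.exp(-1j * E * t / HBAR)
                    for E, w in zip(energies, weights))
    return abs(amplitude) ** 2


def energy_dispersion(energies, weights):
    """Delta(H): the standard deviation of the energy distribution."""
    mean = sum(w * E for E, w in zip(energies, weights))
    return math.sqrt(sum(w * (E - mean) ** 2 for E, w in zip(energies, weights)))


def survival_short_time(energies, weights, t):
    """The leading short-time approximation exp(-(t*Delta(H)/hbar)^2)."""
    return math.exp(-(t * energy_dispersion(energies, weights) / HBAR) ** 2)


energies = [1.0, 3.0]
weights = [0.5, 0.5]          # equal superposition of the two levels
t = 0.1
exact = survival_exact(energies, weights, t)
approx = survival_short_time(energies, weights, t)
# Time at which the approximation has dropped to one half.
t_half = math.sqrt(math.log(2)) * HBAR / energy_dispersion(energies, weights)
print(exact, approx, t_half)
```

For this two-level example the exact probability is $\cos^2(t\Delta/\hbar)$, which agrees with the exponential up to terms of order $t^4$.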


7.4. Simultaneous measurability 


It is not only position and momentum or energy and time which are governed
by an uncertainty principle; there are many other such pairs of complementary
observables to which Lemma 7.2.1 can be applied. One can
even apply it in the case of commuting observables $A$ and $B$ with $C = 0$.
However, in that case it gives rise to the trivial inequality

\[ \Delta_\psi(A)\Delta_\psi(B) \ge 0, \tag{7.29} \]

and it is not usually possible to achieve the lower bound. That happens
only if one of the two dispersions vanishes, and by Proposition 6.3.2 that
can only occur when $\psi$ is an eigenvector of the relevant operator. Ideally we
should like $\psi$ to be an eigenvector of both $A$ and $B$ so that both observables
could be measured precisely. However, we have already seen that in
infinite-dimensional spaces even self-adjoint operators need have no eigenvectors
at all. Nonetheless, the following generalization of a well-known
finite-dimensional theorem (proved in Appendix A1) provides a useful condition
that is sufficient to ensure a good supply of states that are simultaneously
eigenvectors of two commuting observables.


Proposition 7.4.1. Let $A$ and $B$ be self-adjoint operators on the
inner product space $\mathcal{H}$, let $\mathcal{H}_A$ and $\mathcal{H}_B$ denote the subspaces spanned
by eigenvectors of $A$ and $B$, respectively, and let $\mathcal{H}_{A,B}$ denote the span
of the vectors that are simultaneously eigenvectors for both $A$ and $B$.
If $AB = BA$ then $\mathcal{H}_{A,B} = \mathcal{H}_A \cap \mathcal{H}_B$.


This has the following immediate corollary: 


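Corollary 7.4.2. If $AB = BA$ and $\mathcal{H}$ admits an orthonormal basis
of eigenvectors for $A$ then $\mathcal{H}_{A,B} = \mathcal{H}_B$.

Proof. Since $\mathcal{H}$ admits an orthonormal basis of eigenvectors for $A$ we
have $\mathcal{H}_A = \mathcal{H}$, so that

\[ \mathcal{H}_{A,B} = \mathcal{H} \cap \mathcal{H}_B = \mathcal{H}_B. \qquad \Box \]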


This result means that instead of looking for vectors that are eigen- 
vectors of B alone we may as well look for vectors that are simultaneously 
eigenvectors of A and of B since every B-eigenvector is in the span of these. 

This inspires the following definition:

Definition 7.4.1. The observables corresponding to commuting operators
are said to be compatible or simultaneously measurable.
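A finite-dimensional example shows what simultaneous eigenvectors look like in practice. In the sketch below $A$ has a degenerate eigenvalue, so not every eigenvector of $A$ is an eigenvector of $B$, but a joint eigenbasis still exists; the two matrices and the helper names are hypothetical choices for illustration.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(M, v):
    """Apply the matrix M to the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def eigenvalue_of(M, v):
    """Return lam if M v = lam v for the nonzero vector v, otherwise None."""
    w = matvec(M, v)
    k = next(i for i, x in enumerate(v) if x != 0)
    lam = Fraction(w[k], v[k])
    return lam if all(wi == lam * vi for wi, vi in zip(w, v)) else None


A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]   # eigenvalue 1 is doubly degenerate
B = [[0, 1, 0], [1, 0, 0], [0, 0, 5]]

commute = matmul(A, B) == matmul(B, A)

# Candidate joint eigenvectors, labelled by their pair of eigenvalues.
joint = [[1, 1, 0], [1, -1, 0], [0, 0, 1]]
labels = [(eigenvalue_of(A, v), eigenvalue_of(B, v)) for v in joint]

# (1, 0, 0) is an eigenvector of A but not of B.
only_A = (eigenvalue_of(A, [1, 0, 0]), eigenvalue_of(B, [1, 0, 0]))
print(commute, labels, only_A)
```

The pairs of eigenvalues distinguish the three joint eigenvectors even though the eigenvalues of $A$ alone do not.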


7.5. The harmonic oscillator revisited 


The last sections have emphasized the algebraic structure of quantum the- 
ory, so it is natural to ask whether one can solve real quantum mechanical 
problems without resorting to differential equations at all. In fact, for many 
elementary systems this is the case. 

As an illustration we look at the harmonic oscillator. The Hamiltonian
operator is

\[ H = \frac{1}{2m}P^2 + \frac12 m\omega^2X^2, \tag{7.30} \]

so that we should like to find $E$ and $\psi$ satisfying

\[ \left(\frac{1}{2m}P^2 + \frac12 m\omega^2X^2\right)\psi = E\psi, \tag{7.31} \]

where $P$ and $X$ are self-adjoint operators such that $[P, X] = -i\hbar 1$.

As remarked at the end of Section 7.2, the ground state of the harmonic
oscillator is a minimal uncertainty state with $t = m\omega$ and $\mu = 0$. In that
case $E_\psi(P)$ and $E_\psi(X)$ vanish so that $A = P$, $B = X$, and the operator
$A - itB$ simplifies to $P - im\omega X$.

Now, in the classical Hamiltonian treatment of the oscillator $p + im\omega x$
has particularly simple equations of motion (see Exercise 7.18), so this
suggests that we would do well to study the operator

\[ a_- = P - im\omega X \tag{7.32} \]

and its adjoint

\[ a_+ = a_-^* = P + im\omega X. \tag{7.33} \]


Lemma 7.5.1. The operators $a_-$ and $a_+$ satisfy the equations

\[ a_\pm^* a_\pm = a_\mp a_\pm = 2m\left(H \pm \tfrac12\hbar\omega\right). \]

Moreover, for any normalized vector $\psi \in \mathcal{H}$

\[ \|a_\pm\psi\|^2 = 2mE_\psi\left(H \pm \tfrac12\hbar\omega\right). \]

Proof. Lemma 7.2.1(i) gives

\[
\begin{aligned}
(P \mp im\omega X)^*(P \mp im\omega X) &= P^2 + m^2\omega^2X^2 \mp m\omega\hbar \\
&= 2m\left(H \mp \tfrac12\hbar\omega\right).
\end{aligned} \tag{7.34}
\]

The formula for $\|a_\pm\psi\|^2$ likewise follows from the second part of the same
lemma. □


Corollary 7.5.2. Let $\psi$ be an eigenvector of $H$ such that $H\psi = E\psi$.
Then

(i) $\|a_\pm\psi\|^2 = 2m\left(E \pm \tfrac12\hbar\omega\right)\|\psi\|^2$;

(ii) $E \ge \tfrac12\hbar\omega$;

(iii) $E = \tfrac12\hbar\omega$ if and only if $\psi(x) = A\exp(-m\omega x^2/2\hbar)$, that is, $\psi$ is a multiple of $\psi_0(x)$.

Proof. (i) Since for $\psi$ an eigenvector we have $E_\psi(H) = E$, part (i) follows
from the preceding lemma.
(ii) From the first part it is clear that

\[ E \pm \tfrac12\hbar\omega \ge 0, \tag{7.35} \]

so that

\[ E \ge \mp\tfrac12\hbar\omega. \tag{7.36} \]

The sharper inequality now gives the result.
(iii) Bearing in mind the identity (i), we see that equality occurs if and
only if $\|a_-\psi\|$ vanishes, that is if and only if

\[ (P - im\omega X)\psi = 0. \tag{7.37} \]

This is precisely the equation that we solved in Corollary 7.2.2 to get the
minimal uncertainty states. Substituting the known values $t = m\omega$ and
$\mu = 0$ now gives the result. □


We are thus able to obtain the ground state energy and wave function
quite painlessly by purely algebraic techniques, and we shall now go on to
consider the higher energy levels. The first step is to show that the operators
$a_+$ and $a_-$ respectively raise or lower the eigenvalue of eigenvectors of
$H$ by $\hbar\omega$.

Lemma 7.5.3. The operators $a_\pm$ satisfy the equations

\[ Ha_\pm = a_\pm(H \pm \hbar\omega), \]

or equivalently, $[H, a_\pm] = \pm\hbar\omega a_\pm$.

Proof. According to Lemma 7.5.1

\[ 2m\left(H \mp \tfrac12\hbar\omega\right)a_\pm = (a_\pm a_\mp)a_\pm = a_\pm(a_\mp a_\pm) = a_\pm 2m\left(H \pm \tfrac12\hbar\omega\right). \tag{7.38} \]

Simplifying we obtain

\[ Ha_\pm = a_\pm(H \pm \hbar\omega), \tag{7.39} \]

from which the stated commutation relation immediately follows. □


Corollary 7.5.4. If $H\psi = E\psi$, then for $N = 1, 2, \ldots$,

\[ Ha_\pm^N\psi = (E \pm N\hbar\omega)a_\pm^N\psi. \]

Moreover, if $\psi \ne 0$ then $a_+^N\psi \ne 0$, and $a_-^N\psi$ vanishes if and only if $E$
takes one of the values $\tfrac12\hbar\omega, \tfrac32\hbar\omega, \tfrac52\hbar\omega, \ldots, (N - \tfrac12)\hbar\omega$.

Proof. Since the lemma already furnishes the case of $N = 1$, let us
assume inductively that the result is true for $N$. Then, by applying the
operators of the lemma to $a_\pm^N\psi$, we obtain

\[
\begin{aligned}
Ha_\pm^{N+1}\psi &= a_\pm(H \pm \hbar\omega)a_\pm^N\psi \\
&= a_\pm\left[(E \pm N\hbar\omega) \pm \hbar\omega\right]a_\pm^N\psi \\
&= \left[E \pm (N + 1)\hbar\omega\right]a_\pm^{N+1}\psi,
\end{aligned} \tag{7.40}
\]

as required for the inductive step. Moreover, by the inductive hypothesis
$a_+^N\psi$ does not vanish, and by Corollary 7.5.2(i) and (ii)

\[ \|a_+^{N+1}\psi\|^2 = 2m\left(E + N\hbar\omega + \tfrac12\hbar\omega\right)\|a_+^N\psi\|^2 \ge 2m\hbar\omega\|a_+^N\psi\|^2 \ne 0. \tag{7.41} \]

If $a_-^{N+1}\psi$ vanishes then $a_-^N\psi$ is in the kernel of $a_-$, and so, by Corollary
7.5.2(iii), it is either 0 or it is a multiple of $\psi_0$ and its energy, $E - N\hbar\omega$,
must be $\tfrac12\hbar\omega$. In the latter case the energy is therefore $E = (N + \tfrac12)\hbar\omega$,
whilst in the former case the inductive hypothesis tells us that $E$ takes one
of the values $\tfrac12\hbar\omega, \tfrac32\hbar\omega, \ldots, (N - \tfrac12)\hbar\omega$. □


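Theorem 7.5.5. The set of eigenvalues of $H = P^2/2m + \tfrac12 m\omega^2X^2$
is

\[ \left\{(N + \tfrac12)\hbar\omega : N = 0, 1, 2, \ldots\right\}. \]

The eigenvectors corresponding to the eigenvalue $(N + \tfrac12)\hbar\omega$ are the
multiples of

\[ \psi_N = a_+^N\psi_0, \]

where

\[ \psi_0(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{-m\omega x^2/2\hbar}. \]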


Proof. We know that $\psi_0$ is an eigenvector of $H$ with eigenvalue $\tfrac12\hbar\omega$,
so Corollary 7.5.4 tells us that $a_+^N\psi_0$ is an eigenvector of $H$ with value
$(N + \tfrac12)\hbar\omega$. Conversely if $\psi$ is an eigenvector with value $E$ then $a_-^n\psi$ is
either 0 or an eigenvector with eigenvalue $E - n\hbar\omega$. For $n$ larger than
$E/\hbar\omega$, $E - n\hbar\omega$ is negative and so excluded as an admissible eigenvalue by
Corollary 7.5.2(ii). We therefore conclude that $a_-^n\psi$ must vanish and, by
Corollary 7.5.4, $E$ takes a value of the required form.

Finally, if $\psi$ is any eigenvector with eigenvalue $(N + \tfrac12)\hbar\omega$ then $a_-^N\psi$
is an eigenvector of eigenvalue $\tfrac12\hbar\omega$, and so is a multiple of $\psi_0$. Since the
same applies to $\psi_N$ we conclude that $a_-^N\psi = \lambda a_-^N\psi_N$ for some constant
$\lambda$. But now, unless it vanishes, $\psi - \lambda\psi_N$ is an eigenvector with eigenvalue
$(N + \tfrac12)\hbar\omega$ that is outside the range of energies for which $a_-^N(\psi - \lambda\psi_N)$
can vanish according to Corollary 7.5.4. To avoid a contradiction we must
have $\psi = \lambda\psi_N$, showing that the eigenvalues are non-degenerate and that
every eigenvector is a multiple of the appropriate $\psi_N$. □
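The ladder structure can be made concrete by letting the operators act on wave functions of the form $p(x)e^{-x^2/2}$ with $p$ a polynomial. In units with $m = \omega = \hbar = 1$ (an assumption made only for this sketch) one finds that $a_+ = P + iX$ sends $p$ to $-ip' + 2ixp$, that $a_- = P - iX$ sends $p$ to $-ip'$, and that $H$ sends $p$ to $-\tfrac12p'' + xp' + \tfrac12p$. The helper names below are hypothetical.

```python
def pad(p, n):
    """Extend a coefficient list with zeros up to length n."""
    return list(p) + [0] * (n - len(p))


def combine(a, p, b, q):
    """Return the polynomial a*p + b*q as a coefficient list."""
    n = max(len(p), len(q))
    return [a * x + b * y for x, y in zip(pad(p, n), pad(q, n))]


def deriv(p):
    return [k * c for k, c in enumerate(p)][1:] or [0]


def times_x(p):
    return [0] + list(p)


def raise_op(p):
    """Prefactor of a_+ (p e^{-x^2/2}): -i p' + 2i x p."""
    return combine(-1j, deriv(p), 2j, times_x(p))


def lower_op(p):
    """Prefactor of a_- (p e^{-x^2/2}): -i p'."""
    return combine(-1j, deriv(p), 0, p)


def hamiltonian(p):
    """Prefactor of H (p e^{-x^2/2}): -p''/2 + x p' + p/2."""
    partial = combine(-0.5, deriv(deriv(p)), 1, times_x(deriv(p)))
    return combine(1, partial, 0.5, p)


def is_multiple(p, q, factor, tol=1e-9):
    """Check p == factor * q coefficient by coefficient."""
    n = max(len(p), len(q))
    return all(abs(x - factor * y) <= tol for x, y in zip(pad(p, n), pad(q, n)))


# Prefactors of psi_N = a_+^N psi_0 for N = 0..5 (psi_0 has prefactor 1).
prefactors = [[1]]
for _ in range(5):
    prefactors.append(raise_op(prefactors[-1]))

levels_ok = [is_multiple(hamiltonian(p), p, N + 0.5)
             for N, p in enumerate(prefactors)]
ground_annihilated = all(abs(c) == 0 for c in lower_op([1]))
print(levels_ok, ground_annihilated)
```

Each application of `raise_op` raises the degree of the prefactor by one, which is how the Hermite-type polynomials of the higher states arise.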


7.6. Uniqueness of the commutation relations 


Looking back at the arguments of the last section the only place in which
any use was made of the explicit form of the operators $P$ and $X$ was to
establish that the $E = \tfrac12\hbar\omega$ energy level was spanned by a single wave
function $\psi_0$, and derive the formula for $\psi_0$. Elsewhere we used only the
commutation relations between $P$ and $X$. We could therefore have replaced
the explicit use of $P$ and $X$ by the algebraic assumption that there exists
an eigenvector $\Omega$ for the energy level $E = \tfrac12\hbar\omega$, which is unique up to
multiples. Equivalently, by virtue of Corollary 7.5.2(iii), we could have
assumed that $\ker a_-$ is a one-dimensional space spanned by $\Omega$.

In fact this extra assumption is sufficient to determine the algebraic
structure completely, as we can then map the abstract space $\mathcal{H}$ across to
the usual space of wave functions by

\[ \sum_{n=0}^{\infty} c_n a_+^n\Omega \mapsto \sum_{n=0}^{\infty} c_n\psi_n. \tag{7.42} \]

In other words when $\ker(P - im\omega X)$ is one dimensional it is only possible
to satisfy the commutation relations on a space that is isomorphic to
Schrödinger's space of wave functions. Under mild technical assumptions
the above argument can be made rigorous and leads to a result known as
the Stone-von Neumann uniqueness theorem. It brings the assurance that
there is nothing in the detailed theory of wave functions that cannot, in
principle, also be achieved by the algebraic approach, since they work in
isomorphic spaces.


7.7. A generating function for the oscillator wave functions 


The algebraic approach to the harmonic oscillator described in the last
section greatly simplifies the task of finding the wave functions. It enables
one to find the ground state by solving a first-order differential equation
rather than the second-order Schrödinger equation, and then provides the
explicit formula $\psi_N = (ia_+/\hbar)^N\psi_0$ for the wave functions corresponding
to any higher energy level. That formula can be simplified still further by
some elementary calculations.


Lemma 7.7.1. The operator $a_+$ can be written as

\[ \frac{i}{\hbar}a_+ = e^{m\omega x^2/2\hbar}\,\frac{d}{dx}\,e^{-m\omega x^2/2\hbar}. \]

Proof. We first note that

\[ \frac{i}{\hbar}a_+ = \frac{i}{\hbar}(P + im\omega X) = \left(\frac{d}{dx} - \frac{m\omega}{\hbar}x\right). \tag{7.43} \]

Now, for any differentiable function $\phi$ we have

\[ \left(\frac{d}{dx} - \frac{m\omega}{\hbar}x\right)\phi = e^{m\omega x^2/2\hbar}\frac{d}{dx}\left(e^{-m\omega x^2/2\hbar}\phi\right). \tag{7.44} \]

Combining these we deduce the operator identity

\[ \frac{i}{\hbar}a_+ = e^{m\omega x^2/2\hbar}\,\frac{d}{dx}\,e^{-m\omega x^2/2\hbar}, \tag{7.45} \]

as required. □


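Corollary 7.7.2. For $N = 0, 1, 2, \ldots$ we have the identity

\[ \psi_N(x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{m\omega x^2/2\hbar}\,\frac{d^N}{dx^N}\,e^{-m\omega x^2/\hbar}. \]

Proof. This follows from a simple calculation using the identity in the
preceding lemma:

\[
\begin{aligned}
\psi_N(x) &= \left(\frac{i}{\hbar}a_+\right)^N\psi_0 \\
&= \left(e^{m\omega x^2/2\hbar}\,\frac{d}{dx}\,e^{-m\omega x^2/2\hbar}\right)^N\psi_0 \\
&= e^{m\omega x^2/2\hbar}\,\frac{d^N}{dx^N}\left(e^{-m\omega x^2/2\hbar}\psi_0\right) \\
&= \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{m\omega x^2/2\hbar}\,\frac{d^N}{dx^N}\,e^{-m\omega x^2/\hbar}. \qquad \Box
\end{aligned}
\]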


Rather than dealing with the individual wave functions it is more efficient
to combine them.

Definition 7.7.1. The generating function for the harmonic oscillator
wave functions is defined by

\[ G(s, x) = \sum_{N=0}^{\infty} \frac{s^N}{N!}\psi_N(x). \]

Using our earlier formula for $\psi_N$ we can obtain an explicit formula for $G$.


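Theorem 7.7.3. For all real values of $s$ and $x$ the generating function
is given by

\[ G(s, x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp\left[-\frac{m\omega}{\hbar}\left(\tfrac12x^2 + 2sx + s^2\right)\right]. \]

Proof. By the definition of $G$ and the explicit formula for $\psi_N$ we have

\[
\begin{aligned}
G(s, x) &= \sum_{N=0}^{\infty} \frac{s^N}{N!}\psi_N(x) \\
&= \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{m\omega x^2/2\hbar} \sum_{N=0}^{\infty} \frac{s^N}{N!}\frac{d^N}{dx^N}\,e^{-m\omega x^2/\hbar}.
\end{aligned}
\]

By applying Taylor's theorem to the analytic function $\exp(-m\omega x^2/\hbar)$ we
may sum the series to obtain

\[
\begin{aligned}
G(s, x) &= \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{m\omega x^2/2\hbar}\,e^{-m\omega(x+s)^2/\hbar} \\
&= \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp\left[-\frac{m\omega}{\hbar}\left(\tfrac12x^2 + 2sx + s^2\right)\right]. \qquad \Box
\end{aligned}
\]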
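The Taylor-series argument can be checked numerically by summing the series $\sum_N s^N\psi_N(x)/N!$ and comparing it with the closed form $(m\omega/\pi\hbar)^{1/4}\exp[-(m\omega/\hbar)(\tfrac12x^2 + 2sx + s^2)]$. The sketch below assumes units with $m = \omega = \hbar = 1$ and uses the fact that $d^N/dx^N\,e^{-x^2} = q_N(x)e^{-x^2}$ with $q_{N+1} = q_N' - 2xq_N$; the function names and the truncation at 30 terms are choices made for the example.

```python
import math


def next_prefactor(q):
    """If d^N/dx^N e^{-x^2} = q(x) e^{-x^2}, return the prefactor q' - 2xq for N+1."""
    dq = [k * c for k, c in enumerate(q)][1:]
    shifted = [0] + [-2 * c for c in q]            # coefficients of -2x*q(x)
    dq = dq + [0] * (len(shifted) - len(dq))
    return [a + b for a, b in zip(dq, shifted)]


def evaluate(q, x):
    """Evaluate the polynomial with coefficient list q at x."""
    return sum(c * x ** k for k, c in enumerate(q))


def generating_series(s, x, terms=30):
    """Partial sum of sum_N s^N psi_N(x) / N! with m = omega = hbar = 1."""
    total, q = 0.0, [1]
    for N in range(terms):
        # psi_N(x) = pi^(-1/4) e^{x^2/2} q_N(x) e^{-x^2}
        psi_N = math.pi ** -0.25 * math.exp(-x * x / 2) * evaluate(q, x)
        total += s ** N / math.factorial(N) * psi_N
        q = next_prefactor(q)
    return total


def generating_closed_form(s, x):
    """The closed form pi^(-1/4) exp(-(x^2/2 + 2sx + s^2))."""
    return math.pi ** -0.25 * math.exp(-(0.5 * x * x + 2 * s * x + s * s))


for s, x in [(0.3, 0.7), (-0.5, 1.2)]:
    print(generating_series(s, x), generating_closed_form(s, x))
```

The integer coefficient lists are exact, so the only approximation is the truncation of the series, which converges rapidly for moderate $s$.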


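The generating function provides a very economical way of calculating
normalizations and expectations, such as

\[
\begin{aligned}
\sum_{N,M=0}^{\infty} \frac{s^N t^M}{N!\,M!}\langle\psi_N|\psi_M\rangle
&= \int_{\mathbb{R}} G(s, x)G(t, x)\,dx \\
&= \left(\frac{m\omega}{\pi\hbar}\right)^{1/2} \int_{\mathbb{R}} \exp\left(-\frac{m\omega}{\hbar}\left(s^2 + 2sx + t^2 + 2tx + x^2\right)\right)dx \\
&= \left(\frac{m\omega}{\pi\hbar}\right)^{1/2} e^{2m\omega st/\hbar} \int_{\mathbb{R}} \exp\left(-\frac{m\omega}{\hbar}(x + s + t)^2\right)dx.
\end{aligned} \tag{7.47}
\]

On substituting $u = x + s + t$ and doing the integration we find that

\[ \sum_{M,N=0}^{\infty} \frac{s^N t^M}{N!\,M!}\langle\psi_N|\psi_M\rangle = e^{2m\omega st/\hbar} = \sum_{N=0}^{\infty} \frac{1}{N!}\left(\frac{2m\omega st}{\hbar}\right)^N. \tag{7.48} \]

On comparing coefficients we see that

\[ \langle\psi_N|\psi_M\rangle = \delta_{NM}\,N!\left(\frac{2m\omega}{\hbar}\right)^N, \tag{7.49} \]

confirming the orthogonality of the eigenvectors and providing the appropriate
normalization.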


We may also use the generating function to prove the following useful 
result: 


Theorem 7.7.4. If the wave function $\psi$ is orthogonal to $\psi_N$ for all
$N \ge 0$ then $\psi = 0$.


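Proof. Since $G(s, x)$ is defined by a power series it can be extended to
complex values of $s$, and we have

\[
\begin{aligned}
G(a + i\alpha, x) &= \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp\left[-\frac{m\omega}{\hbar}\left(\tfrac12x^2 + 2(a + i\alpha)x + (a + i\alpha)^2\right)\right] \\
&= \exp\left(-\frac{m\omega}{\hbar}\left[2i\alpha(x + a) - \alpha^2\right]\right)G(a, x).
\end{aligned} \tag{7.50}
\]

Equation (3.39) for the inversion of the Fourier transform, (3.38), can be
applied to a product of functions to give

\[ F(x, x)\psi(x) = \frac{1}{2\pi\hbar}\int_{\mathbb{R}^2} e^{ip(y - x)/\hbar}F(x, y)\psi(y)\,dy\,dp. \tag{7.51} \]

Setting $F(x, y) = G(a, x)G(a, y)$ and substituting $p = 2m\omega\alpha$ then gives

\[
\begin{aligned}
G(a, x)^2\psi(x) &= \frac{m\omega}{\pi\hbar}\int_{\mathbb{R}^2} e^{-2im\omega\alpha(x - y)/\hbar}G(a, x)G(a, y)\psi(y)\,dy\,d\alpha \\
&= \frac{m\omega}{\pi\hbar}\int_{\mathbb{R}^2} e^{-2m\omega\alpha^2/\hbar}G(a + i\alpha, x)\overline{G(a + i\alpha, y)}\psi(y)\,dy\,d\alpha \\
&= \frac{m\omega}{\pi\hbar}\int_{\mathbb{R}} e^{-2m\omega\alpha^2/\hbar}G(a + i\alpha, x)\langle G(a + i\alpha)|\psi\rangle\,d\alpha.
\end{aligned} \tag{7.52}
\]

If $\langle\psi_N|\psi\rangle$ vanishes for all $N$ then so does

\[ \langle G(a + i\alpha)|\psi\rangle = \sum_{N=0}^{\infty} \frac{(a - i\alpha)^N}{N!}\langle\psi_N|\psi\rangle, \tag{7.53} \]

which means that $G(a, x)^2\psi(x) = 0$, and the result follows immediately. □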


This result tells us that there are enough harmonic oscillator wave functions
to span the space (in the sense of the definition given in Appendix
A1). This can be seen still more explicitly in Exercise 7.20.


7.8*. Coherent states


The algebraic approach to the harmonic oscillator may seem rather ab- 
stract, but for some applications it is actually closer to the physics than a 
differential equation. 

Laser light is usually approximately monochromatic; that is, it has a
precisely defined frequency, $\omega$. Each component, $\phi$, of the electromagnetic
field therefore satisfies the equation

\[ \frac{d^2\phi}{dt^2} = -\omega^2\phi, \tag{7.54} \]

so that $\phi$ satisfies the same equation as a classical harmonic oscillator.
Knowing that we are simply dealing with an oscillator it is easy to quantize
the light wave, as, according to Planck and Einstein, we must. The
energy levels will then be $(N + \tfrac12)\hbar\omega$. The ground state energy $\tfrac12\hbar\omega$ is then
interpreted as the energy of the vacuum before the beam has been turned
on. Each successive level adds an energy $\hbar\omega$, that is precisely the energy
of one extra photon. Thus the energy level $(N + \tfrac12)\hbar\omega$ is interpreted as
having $N$ photons added to the vacuum. Taking a vector $\psi$ and forming
$a_+\psi$ raises the energy by $\hbar\omega$, and so can be regarded as the creation of
one photon. Similarly $a_-$, which lowers the energy, can be thought of as
annihilating one photon. The characterization of the ground state as the
vector satisfying $a_-\psi = 0$ corresponds to the physical idea that the vacuum
contains no photons for $a_-$ to annihilate.
This physical picture leads to the following terminology: 


Definition 7.8.1. The operator $a_+$ is called a creation operator.
Similarly $a_-$ is called an annihilation operator.

The detailed theory of lasers has to explain the way in which the beams
are generated and that involves a careful consideration of the interaction
between matter and radiation that lies beyond the scope of this book.
(The principles were first recognized by Einstein in 1916.) However, the
beams themselves can often be described by the state $C(s)$ obtained by
normalizing the generating function $G(s)$. We have already shown in the
preceding section that

\[ \langle G(s)|G(t)\rangle = e^{2m\omega st/\hbar}, \tag{7.55} \]

so that

\[ \|G(s)\|^2 = e^{2m\omega s^2/\hbar}, \tag{7.56} \]

and we may take

\[ C(s) = e^{-m\omega s^2/\hbar}G(s) = e^{-m\omega s^2/\hbar}\sum_{N=0}^{\infty} \frac{s^N}{N!}\psi_N. \tag{7.57} \]

Definition 7.8.2. The state described by $C(s)$ is known as a coherent
state.

The name derives from the fact that $C(s)$ can be used to describe coherent
light, although in that case the parameter $m$ would no longer be
interpreted as a mass.


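Proposition 7.8.1. Coherent states satisfy

\[ C(s, x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp\left(-\frac{m\omega}{2\hbar}(x + 2s)^2\right), \]

\[ \langle C(s)|C(t)\rangle = \exp\left(-\frac{m\omega}{\hbar}|s - t|^2\right). \]

Proof. The first assertion follows from the definition since

\[ C(s, x) = e^{-m\omega s^2/\hbar}G(s, x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \exp\left(-\frac{m\omega}{2\hbar}(x + 2s)^2\right). \tag{7.58} \]

The second follows from

\[
\begin{aligned}
\langle C(s)|C(t)\rangle &= \exp\left(-\frac{m\omega}{\hbar}(s^2 + t^2)\right)\langle G(s)|G(t)\rangle \\
&= \exp\left(-\frac{m\omega}{\hbar}(s^2 + t^2 - 2st)\right),
\end{aligned} \tag{7.59}
\]

which gives the stated result. □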


As the exponential of a quadratic function of $x$, $C(s)$ is a minimal
uncertainty state. (If we allow complex values of $s$ then Corollary 7.2.2 tells
us that every minimal uncertainty state is a scalar multiple of a coherent
state.)

Being a sum over different $\psi_N$, $C(s)$ describes a state with no specific
number of photons. The probability of finding $n$ photons is the same as
the probability that the energy is $(n + \tfrac12)\hbar\omega$. Bearing in mind that $\psi_n$ is
not normalized this is

\[
\begin{aligned}
\frac{|\langle\psi_n|C(s)\rangle|^2}{\|\psi_n\|^2} &= e^{-2m\omega s^2/\hbar}\,\frac{|\langle\psi_n|G(s)\rangle|^2}{\|\psi_n\|^2} \\
&= e^{-2m\omega s^2/\hbar}\,\frac{|s|^{2n}}{(n!)^2}\,\|\psi_n\|^2.
\end{aligned} \tag{7.60}
\]

We showed in the preceding section that

\[ \|\psi_n\|^2 = n!\left(\frac{2m\omega}{\hbar}\right)^n, \tag{7.61} \]

so the probability reduces to

\[ \frac{1}{n!}\left(\frac{2m\omega s^2}{\hbar}\right)^n e^{-2m\omega s^2/\hbar}, \tag{7.62} \]

that is, the number of photons follows a Poisson distribution with
mean $2m\omega s^2/\hbar$.
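The reduction to a Poisson distribution can be confirmed numerically by assembling the probability from its ingredients, $\langle\psi_n|G(s)\rangle = (s^n/n!)\|\psi_n\|^2$ and $\|\psi_n\|^2 = n!(2m\omega/\hbar)^n$, and comparing it with the Poisson formula of mean $2m\omega s^2/\hbar$. The sketch assumes real $s$; the default unit values and function names are illustrative choices.

```python
import math


def photon_probability(n, s, m=1.0, omega=1.0, hbar=1.0):
    """Probability of n photons in the coherent state C(s), for real s."""
    norm_sq = math.factorial(n) * (2 * m * omega / hbar) ** n   # ||psi_n||^2
    overlap = s ** n / math.factorial(n) * norm_sq              # <psi_n|G(s)>
    return math.exp(-2 * m * omega * s * s / hbar) * overlap ** 2 / norm_sq


def poisson(n, mean):
    """Poisson probability mean^n e^{-mean} / n!."""
    return mean ** n * math.exp(-mean) / math.factorial(n)


s = 0.8
mean = 2 * s * s                       # 2*m*omega*s^2/hbar with unit constants
probabilities = [photon_probability(n, s) for n in range(40)]
total = sum(probabilities)
average = sum(n * p for n, p in enumerate(probabilities))
print(total, average, mean)
```

Truncating at 40 photons is harmless here because the Poisson tail beyond that point is negligible for a mean of order one.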


7.9*. Squeezed states

It is quite easy to show that both position and momentum have zero
expectation values in the ground state, $\psi_0$, of the harmonic oscillator, and
the energy is split evenly between potential and kinetic energy, so that

\[ E_{\psi_0}(P^2/2m) = \tfrac14\hbar\omega = E_{\psi_0}\left(\tfrac12 m\omega^2X^2\right), \tag{7.63} \]

(see Exercises 7.6 and 7.7). Combining these facts we see that

\[ \Delta_{\psi_0}(P)^2 = E_{\psi_0}(P^2) = \tfrac12 m\hbar\omega = \Delta_{\psi_0}(m\omega X)^2, \tag{7.64} \]

so that the ground state not only achieves the minimum permitted by the
uncertainty principle, but it distributes that uncertainty evenly between
the position and momentum. When we take the ground state $\psi_\zeta$ of an
oscillator with a different frequency, $\zeta\omega$, we similarly obtain

\[ \Delta_{\psi_\zeta}(P)^2 = \tfrac12 m\hbar\zeta\omega = \Delta_{\psi_\zeta}(m\zeta\omega X)^2, \tag{7.65} \]

whence it follows that $\Delta_{\psi_\zeta}(P)^2 = \tfrac12 m\hbar\zeta\omega$ and $\Delta_{\psi_\zeta}(m\omega X)^2 = \tfrac12 m\hbar\zeta^{-1}\omega$.
By choosing $\zeta < 1$ we may reduce the dispersion of $P$ at the expense of
increasing that of $m\omega X$, whilst retaining the minimal uncertainty overall.
It is possible to prepare such states of light in the laboratory, and they are
known as squeezed states, with squeezing parameter $r = -\tfrac12\ln\zeta$. (It is also
possible to work with complex values of $\zeta$, but then one can no longer
interpret $\zeta\omega$ as a frequency, the expression for $r$ becomes more complicated,
and the uncertainty is not minimal.) The state $\psi_\zeta$ is not an eigenstate of
the physical Hamiltonian, $P^2/2m + \tfrac12 m\omega^2X^2$, and so it changes with time.
Initially the dispersion of $P$ increases whilst that of $m\omega X$ diminishes until
the above values are reversed, and then the changes continue in a periodic
cycle. Nonetheless this is entirely predictable and it provides a way of
sidestepping some of the effects of the uncertainty principle. This is
particularly useful when transmitting signals with small numbers of photons,
so that the uncertainty limits are important.

It is quite easy to calculate the expected number of photons in the
squeezed state. Bearing in mind that the energy is $(n + \tfrac12)\hbar\omega$, we make the
following definition:

Definition 7.9.1. The number operator is defined to be

\[ N = (\hbar\omega)^{-1}\left(P^2/2m + m\omega^2X^2/2\right) - \tfrac12. \]

Using our earlier expressions for the kinetic and potential energy, the
expectation value of $N$ is given by

\[
\begin{aligned}
(\hbar\omega)^{-1}E\left(P^2/2m + m\omega^2X^2/2\right) - \tfrac12 &= \tfrac14\left(\zeta + \zeta^{-1}\right) - \tfrac12 \\
&= \left(\frac{1 - \zeta}{2\sqrt{\zeta}}\right)^2.
\end{aligned} \tag{7.66}
\]

Unless $\zeta = 1$, this is positive showing that the squeezed states do contain
photons. By a more careful analysis one can find the complete probability
distribution for $N$.

Exercises
7.1 Let A and B be self-adjoint operators. Show that their commutator 


7.2 


7.3° 


7.4 


7.5 


7.6 


satisfies 
[A, B]* = -[4, B]. 


Hence or otherwise show that —i[A, B] is self-adjoint. 


Let X,, X2, X3 and P,, Po, P3 be the position and momentum observ- 
ables for a particle moving in three dimensions. Define the angular 
momentum observables L; = (X2P3 — X3P2), Le = (X3P, — X1P3) 
and Ls = (X1P2 — X2P,) and show that 


[Li, Lo] = thLs. 


A particle of mass m moves along the z-axis under the influence of 
a potential V(x) = 4mw?a?. Show that, if T = P?/2m is its kinetic 
energy, then 


Evie) 2 (“) 


Show that for a differentiable function, f, 
(i) [P, f(X)] = -éaf'(X); 
(ii) [XP, f(X)] = —#hX F(X). 
By considering Fourier transforms deduce that, for any differ- 
entiable function, g, 
(iii) [X, 9(P)] = thg’(P); 
(iv) [XP, 9(P)] = iAPo'(P). 
By taking g(P) = exp(—iaP/h) in part (iii) of the previous exercise, 
or otherwise, show that 


eitP/h Xe -1eP/h par X +a. 


Deduce that for any function f that can be expanded in a power 
series, ; 

eftPlh (Xe OF lt = F(X +a). 
Hence, or otherwise, deduce Weyl’s form of the commutation rela- 


tions: sie 
etaP/h eibX/h — gtba/hi p1bX/h pia / : 


Let ~ be an eigenvector with energy E for the harmonic oscillator 
Hamiltonian H = P?/2m + imw*X?. By considering E(ax) and 
E,,(a2.) or otherwise, show that Ey(X) and Ey(P) vanish and that 


Ey(T) = E,(V), 
where $T = P^2/2m$. Deduce that $\Delta_\psi(P)^2 = mE$, and find the
corresponding expression for $\Delta_\psi(X)$. Verify the formulae for the
expectation values of $T$ and $V$ directly for the eigenstates of the
one-dimensional harmonic oscillator.

[The eigenstates $\psi_n$ corresponding to the energy levels $(n + \frac12)\hbar\omega$ of
the harmonic oscillator satisfy
\[ \left(\frac{m\omega}{\hbar}\right)^{1/2}x\psi_n = \left(\frac{n}{2}\right)^{1/2}\psi_{n-1} + \left(\frac{n+1}{2}\right)^{1/2}\psi_{n+1}.] \]

7.7 Show that if $H\psi = E\psi$ then for any operator $A$
\[ \langle\psi|[H, A]\psi\rangle = 0. \]
Suppose that $H = T + V$ where $T = P^2/2m$ and $V = kX^N$ with $k$ a
complex number.

(i) By taking $A = X$ show that $E_\psi(P) = 0$.

(ii) By taking $A = XP$ derive the virial theorem:
\[ 2E_\psi(T) = NE_\psi(V). \]

(iii) Deduce that $E_\psi(T) = NE/(N + 2)$, and find $\Delta_\psi(P)$.

(iv) Show that the hydrogen atom potential is homogeneous of degree
$-1$ and deduce that any bound state energy $E$ is negative.

(v) When $V = \frac12 m\omega^2X^2$ show that
\[ \Delta_\psi(P)\Delta_\psi(X) = \frac{E}{\omega}. \]

7.8 At time $t = 0$ a quantum mechanical system is in a state for which
both the expected values of position and momentum are 0. The
Hamiltonian operator for the system is $H = P^2/2m$. Prove that
at a subsequent time $t$, $\Delta_\psi(P)$ has the same value as at $t = 0$, but
that
\[ \frac{d}{dt}\left((\Delta_\psi(X))^2\right) = \frac{1}{m}E_\psi(XP + PX). \]
Hence, or otherwise, show that $\Delta_\psi(X)^2$ increases quadratically with
$t$.

[Hint: The results of Exercise 6.7 may be used.]

7.9° An operator $a$ satisfies $[a, a^*] = 1$ where $a^*$ is the adjoint of $a$. If
the operator $N = a^*a$ show that $N$ has for eigenvalues the set of
non-negative integers. Show also that if each eigenvalue $k$ is
non-degenerate then it is possible to choose the corresponding normalized
eigenvectors $\psi_k$ in such a way that
\[ a^*\psi_k = \sqrt{k+1}\,\psi_{k+1}. \]
Hence, or otherwise, derive the energy levels of the Hamiltonian
$H = P^2/2m + \frac12 m\omega^2X^2$. Derive the ground state and first excited state as
functions of $x$.

7.10° The operator $a$ satisfies the relations
\[ a^2 = 0, \qquad a^*a + aa^* = 1. \]
The operator $N$ is defined by $N = a^*a$. Find $[N, a]$ and $[N, a^*]$,
and show that $N$ is a self-adjoint projection, that is $N = N^2 = N^*$.
Obtain a matrix representation for $a$, $a^*$ and $N$ in which $N$ is diagonal.

7.11 A charged particle moving in the plane perpendicular to a magnetic
field $B$ has Hamiltonian
\[ H = \frac{1}{2m}\left[\left(P_1 + \tfrac12 eX_2B\right)^2 + \left(P_2 - \tfrac12 eX_1B\right)^2\right]. \]
Show that the set of energy eigenvalues is
\[ \left\{\left(n + \tfrac12\right)\frac{\hbar eB}{m} : n = 0, 1, 2, \ldots\right\}. \]

7.12 Use the generating function
\[ G(s, x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\exp\left[-\frac{m\omega}{\hbar}\left(s^2 + 2sx + \tfrac12 x^2\right)\right] \]
to find the normalized wave functions for the first and second excited
states of the harmonic oscillator.

7.13 Show that
\[ a_-G(s) = (2im\omega s)G(s). \]

7.14 Show that $\langle\psi_M|X\psi_N\rangle$ vanishes unless $|M - N| = 1$, and for all $N > 0$
find $\langle\psi_{N-1}|X\psi_N\rangle$ and $\langle\psi_N|X^2\psi_N\rangle$
(i) by using generating functions,
(ii) by writing $X = (a_+ - a_-)/2im\omega$.
7.15 Let $G_t(s, x) = \exp(-\frac12 i\omega t)G(\exp(-i\omega t)s, x)$. Show that
\[ i\hbar\frac{\partial G_t}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2G_t}{\partial x^2} + \tfrac12 m\omega^2x^2G_t. \]
At time $t = 0$ the wave function for the harmonic oscillator is
\[ \psi(0, x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\exp\left(-\frac{m\omega}{2\hbar}(x - a)^2\right). \]
Find the wave function at time $t$.

7.16 A particle moves on the $x$-axis in a potential, $V$, that is periodic of
period $a$. (That is, $V(x + a) = V(x)$ for all $x \in \mathbf{R}$.) Let $T_a$ denote
the operator defined by
\[ (T_a\psi)(x) = \psi(x + a). \]
Show that $T_a$ commutes with the Hamiltonian operator $H$.

Show also that $T_a\psi = \lambda\psi$ if and only if $\lambda^{-x/a}\psi(x)$ is periodic with
period $a$. Deduce that the energy eigenstates of the form $\psi(x) =
\exp(ikx/a)\phi(x)$, with $\phi$ periodic of period $a$, span the space of all
energy eigenstates.

7.17 Show using Lemma 7.3.1 that


ei 2 
(Hide) = 1 +0) [eM exp (ae) dB. 


[This shows that for small times $s$ the behaviour is as though the energy
were normally distributed with mean $E_\psi(H)$ and variance $\Delta_\psi(H)^2$.]

7.18 Show that in classical Hamiltonian mechanics the equations of motion
for a one-dimensional harmonic oscillator with potential $\frac12 m\omega^2x^2$ can
be written in complex form as
\[ \frac{d}{dt}(p + im\omega x) = i\omega(p + im\omega x). \]
Deduce that
\[ [p(t) + im\omega x(t)] = e^{i\omega t}[p(0) + im\omega x(0)], \]
and find an expression for the position at time $t$ in terms of the initial
values, $p(0)$ and $x(0)$.
7.19 Suppose that there is a map, $Q$, from functions of $p$ and $x$ to operators,
which satisfies $[Q(f), Q(g)] = -i\hbar Q(\{f, g\})$, and also that $Q(p) = P$
and $Q(x) = X$.
(i) Show that $Q(x^k) - X^k$ and $Q(p^k) - P^k$ commute with $P$ and
$X$, for any $k \ge 1$.
(ii) Show that $12\{p^3, x^3\} = \{\{p^3, x^2\}, \{x^3, p^2\}\}$.
(iii) Show that $-12\hbar^2[P^3, X^3] \ne [[P^3, X^2], [X^3, P^2]]$.
(iv) Deduce the Groenewold-van Hove theorem that $Q$ cannot exist.
[You may assume that the only operators which commute with
$P$ and $X$ are multiples of the identity. The results of Exercise
7.4 may be useful.]


7.20 Show that for any wave function $\psi$, one has


eW 2mu(a?+a")/RG(a + ia, 2)(G(a + ia)|p) dada. 


8 Angular momentum 


If therefore the constant h of Planck has ... an atomic significance, it
may mean that the angular momentum of an atom can only rise or fall by
discrete amounts when an electron leaves or returns.


JOHN NICHOLSON, In The constitution of the solar corona, 1912 


8.1. Angular momentum in quantum mechanics 


In many of the systems studied in classical mechanics angular momentum 
is far more useful than ordinary linear momentum. For example, in the 
study of planetary orbits, or indeed any motion under a central force, the 
conservation of angular momentum plays a crucial role. We shall therefore 
turn our attention to the description of angular momentum in quantum 
mechanics. 

The classical angular momentum of a particle is described by the vector
$\mathbf{x} \times \mathbf{p}$. This suggests that quantum mechanical angular momentum should
be described by three operator components:


Definition 8.1.1. The quantum mechanical orbital angular momentum
vector, $\mathbf{L}$, has for components the operators

\[ L_1 = X_2P_3 - X_3P_2, \qquad L_2 = X_3P_1 - X_1P_3, \qquad L_3 = X_1P_2 - X_2P_1. \]


By use of the alternating symbol

\[ \epsilon_{jkl} = \begin{cases} 1 & \text{if } (jkl) \text{ is a cyclic permutation of } (123) \\ -1 & \text{if } (jkl) \text{ is a cyclic permutation of } (213) \\ 0 & \text{in all other cases,} \end{cases} \tag{8.1} \]

and the summation convention (that we sum over any repeated index over
the values 1, 2 and 3), we may also write this as

\[ L_j = \epsilon_{jkl}X_kP_l. \tag{8.2} \]


The fundamental algebraic relations for the angular momentum are sum- 
marized in the following result. 
Proposition 8.1.1. The angular momentum satisfies the following
commutation relations:

(i) $[L_j, P_k] = i\hbar\epsilon_{jkl}P_l$;

(ii) $[L_j, X_k] = i\hbar\epsilon_{jkl}X_l$;

(iii) $[L_j, L_k] = i\hbar\epsilon_{jkl}L_l$.


Proof. (i) Using Proposition 7.1.2(iii) together with the commutation
relations, 7.1.1, we have

\[ [X_sP_t, P_k] = [X_s, P_k]P_t + X_s[P_t, P_k] = i\hbar\delta_{sk}P_t. \tag{8.3} \]

So

\[ [L_j, P_k] = \epsilon_{jst}[X_sP_t, P_k] = i\hbar\epsilon_{jst}\delta_{sk}P_t = i\hbar\epsilon_{jkt}P_t. \tag{8.4} \]

(ii) The formula for $[L_j, X_k]$ is obtained similarly.
(iii) Finally,

\begin{align*}
[L_1, L_2] &= [L_1, X_3P_1 - X_1P_3] \\
 &= X_3[L_1, P_1] + [L_1, X_3]P_1 - X_1[L_1, P_3] - [L_1, X_1]P_3 \\
 &= 0 - i\hbar X_2P_1 + i\hbar X_1P_2 - 0 \\
 &= i\hbar L_3, \tag{8.5}
\end{align*}

and the other non-trivial commutation relations between the components
of $\mathbf{L}$ follow by permutations of the indices. □


It is useful now to abstract from the above result the following more
general idea:


Definition 8.1.2. Any three operators $J_1$, $J_2$, and $J_3$ that satisfy
the commutation relations

\[ [J_j, J_k] = i\hbar\epsilon_{jkl}J_l \]

are called angular momentum operators.


Most of the following analysis works equally for orbital and general
angular momentum operators. The commutation relations can be cast into a
useful coordinate-free form by introducing the angular momentum operator
in the direction of the vector $\mathbf{a}$

\[ \mathbf{a}.\mathbf{J} = a_1J_1 + a_2J_2 + a_3J_3. \tag{8.6} \]

Then we have in place of (iii):


Proposition 8.1.2.

\[ [\mathbf{a}.\mathbf{J}, \mathbf{b}.\mathbf{J}] = i\hbar(\mathbf{a} \times \mathbf{b}).\mathbf{J}. \]


Proof. The derivation of this is left as a simple exercise. □
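One way the omitted derivation might run is sketched here, assuming that the components of $\mathbf{a}$ and $\mathbf{b}$ are ordinary numbers (so that they commute with every operator) and using the summation convention together with Definition 8.1.2:

\begin{align*}
[\mathbf{a}.\mathbf{J}, \mathbf{b}.\mathbf{J}]
 &= [a_jJ_j, b_kJ_k] \\
 &= a_jb_k[J_j, J_k] \\
 &= i\hbar\,\epsilon_{jkl}a_jb_kJ_l \\
 &= i\hbar\,(\mathbf{a} \times \mathbf{b})_lJ_l
  = i\hbar(\mathbf{a} \times \mathbf{b}).\mathbf{J}.
\end{align*}

The only ingredients are the bilinearity of the commutator and the identity $(\mathbf{a} \times \mathbf{b})_l = \epsilon_{jkl}a_jb_k$, which follows from the cyclic symmetry of the alternating symbol.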


8.2. Ladder operators 


The fact that the three components of angular momentum do not commute 
with each other is at first sight somewhat disconcerting, for this means that 
they cannot simultaneously be measured precisely, and yet we expected 
them to be constants of the motion in central force problems. To overcome 
this difficulty we need another related operator. 


Definition 8.2.1. The total angular momentum is defined by

\[ J^2 = J_1^2 + J_2^2 + J_3^2. \]


This is more tractable as the following result shows.


Proposition 8.2.1. Suppose that $[J_j, A_k] = i\hbar\epsilon_{jkl}A_l$, $[J_j, B_k] =
i\hbar\epsilon_{jkl}B_l$, and that $\mathbf{A}.\mathbf{B} = A_1B_1 + A_2B_2 + A_3B_3$. Then

\[ [J_j, \mathbf{A}.\mathbf{B}] = 0 \]

for $j = 1, 2, 3$.

Proof. With the summation convention we may write $\mathbf{A}.\mathbf{B}$ as $A_kB_k$ to
obtain

\begin{align*}
[J_j, \mathbf{A}.\mathbf{B}] &= [J_j, A_kB_k] \\
 &= A_k[J_j, B_k] + [J_j, A_k]B_k \\
 &= A_ki\hbar\epsilon_{jkl}B_l + i\hbar\epsilon_{jkl}A_lB_k \\
 &= i\hbar\epsilon_{jkl}(A_kB_l + A_lB_k) \\
 &= 0, \tag{8.7}
\end{align*}

where the last line follows from the fact that $\epsilon_{jkl}$ is antisymmetric in $k$ and
$l$ whilst the bracketed term is symmetric. □


Corollary 8.2.2. Each of the operators $X^2$, $P^2$, $\mathbf{X}.\mathbf{P}$, and $L^2$
commutes with $L_j$, and $J^2$ commutes with $J_j$, for $j = 1, 2, 3$.


This result, which is an immediate consequence of the last two
propositions, means that we can find simultaneous eigenvectors of $J^2$ and $J_3$. Once
$J_3$ has been given an eigenvalue the relation $[J_1, J_2] = i\hbar J_3$ will closely
resemble the commutation relation between $X$ and $P$, and this suggests that
it might be useful to introduce some more operators:


Definition 8.2.2. The ladder operators $J_\pm$ are defined by

\[ J_\pm = J_1 \pm iJ_2. \]


Proposition 8.2.3. The ladder operators satisfy the following
relations:

(i) $[J^2, J_\pm] = 0$;

(ii) $J_\mp J_\pm = J^2 - (J_3^2 \pm \hbar J_3)$;

(iii) $[J_+, J_-] = 2\hbar J_3$;

(iv) $J_3J_\pm = J_\pm(J_3 \pm \hbar)$ or, equivalently, $[J_3, J_\pm] = \pm\hbar J_\pm$.

Proof. (i) The first part is a trivial consequence of Corollary 8.2.2.
(ii) Applying Lemma 7.2.1 with $A = J_1$, $B = J_2$, $C = -\hbar J_3$, and $t = \pm 1$,
we have

\begin{align*}
J_\mp J_\pm &= J_1^2 + J_2^2 \mp \hbar J_3 \\
 &= J^2 - (J_3^2 \pm \hbar J_3). \tag{8.8}
\end{align*}

(iii) This part follows immediately from the previous one:

\[ [J_+, J_-] = [J^2 - (J_3^2 - \hbar J_3)] - [J^2 - (J_3^2 + \hbar J_3)] = 2\hbar J_3. \tag{8.9} \]

(iv) Finally, we have by direct calculation:

\begin{align*}
[J_3, J_\pm] &= [J_3, J_1 \pm iJ_2] \\
 &= i\hbar J_2 \pm \hbar J_1 \\
 &= \pm\hbar(J_1 \pm iJ_2) \\
 &= \pm\hbar J_\pm, \tag{8.10}
\end{align*}

which completes the proof. □


Corollary 8.2.4. Let $\psi$ be a common eigenvector of $J^2$ and $J_3$,
satisfying $J^2\psi = \lambda\hbar^2\psi$ and $J_3\psi = m\hbar\psi$. Then

(i) $J^2J_\pm\psi = \lambda\hbar^2J_\pm\psi$;

(ii) $J_3J_\pm\psi = (m \pm 1)\hbar J_\pm\psi$;

(iii) $\|J_\pm\psi\|^2 = [\lambda - m(m \pm 1)]\hbar^2\|\psi\|^2$;

(iv) $\lambda \ge m(m \pm 1)$, and $\lambda = m(m \pm 1)$ if and only if $J_\pm\psi = 0$.


Proof. We know from Corollary 8.2.2 that $J^2$ and $J_3$ commute and so
have common eigenvectors. We know also from part (i) of the preceding
proposition that $J^2$ commutes with $J_\pm$, from which the first part of the
corollary follows immediately.

Applying both sides of the operator identity 8.2.3(iv) to $\psi$ we obtain the
second identity.

Since $J_+$ and $J_-$ are adjoints we have

\[ \|J_\pm\psi\|^2 = \langle\psi|J_\mp J_\pm\psi\rangle, \tag{8.11} \]

and from Proposition 8.2.3(ii) we deduce that

\begin{align*}
\|J_\pm\psi\|^2 &= \langle\psi|[J^2 - (J_3^2 \pm \hbar J_3)]\psi\rangle \\
 &= [\lambda - m(m \pm 1)]\hbar^2\|\psi\|^2. \tag{8.12}
\end{align*}

The inequality relating $\lambda$ and $m$ follows immediately, together with the
condition for equality. □


8.3. Representations of the angular momentum operators 


We have now assembled the same kind of information about the angular
momentum operators that Corollary 7.5.2 supplied for the creation and
annihilation operators and we can now follow the strategy of Section 7.5.
Just as with the energy of the harmonic oscillator, there are barriers that
prevent us from changing the eigenvalues of $J_3$ too much, but this time the
inequality $\lambda \ge m(m \pm 1)$ provides both upper and lower bounds, blocking
the action of both $J_+$ and $J_-$. As with the oscillator, this is the key to
discovering those $J_3$ eigenvalues.


Theorem 8.3.1. The eigenvalues of $J^2$ have the form $j(j + 1)\hbar^2$ for
$j = 0, \frac12, 1, \frac32, 2, \frac52, \ldots$

For each choice of $j$ the eigenvalues of $J_3$ are $m\hbar$ for $m = -j,
-j+1, \ldots, j-1, j$. The degeneracy of each eigenvalue is the same as
that of $j\hbar$.


Proof. The inequality provided by Corollary 8.2.4(iv) is particularly
transparent when written in the form $\lambda + \frac14 \ge (m \pm \frac12)^2$, and clearly shows
that for a given $\lambda$, the eigenvalue $m$ is bounded above and below. Since,
by repeated use of Corollary 8.2.4(ii), $J_\pm^N\psi$ either vanishes or is another
eigenvector with eigenvalue $(m \pm N)\hbar$, it is clear that for some non-negative
integers $p$ and $q$ both $J_+^{p+1}\psi$ and $J_-^{q+1}\psi$ must vanish. We may as well
assume that these are the least integers with this property, so that $J_+^p\psi \ne 0$ is
an eigenvector of $J_3$ with eigenvalue $(m + p)\hbar$ and the identity $J_+(J_+^p\psi) = 0$
is the condition required in 8.2.4(iv) for the equality $\lambda = (m + p)(m + p + 1)$.
Similarly, from $J_-(J_-^q\psi) = 0$ we deduce that $\lambda = (m - q)(m - q - 1)$.

Combining these we see that $(m + p)(m + p + 1) = (m - q)(m - q - 1)$,
which can be rearranged as

\[ 2m(p + q + 1) = (q - p)(p + q + 1). \tag{8.13} \]

Since $p + q + 1$ is positive we deduce that $m = (q - p)/2$. We may therefore
write $m + p = j$ where $j = (q + p)/2$ and deduce that

\[ \lambda = (m + p)(m + p + 1) = j(j + 1). \tag{8.14} \]

It is immediate from our definition that $j$ is half of a non-negative integer,
and that $m = j - p$ must differ from it by an integer. To see that each
such value of $m$ does occur in an eigenvalue of $J_3$, we first note that $\psi_j =
J_+^p\psi$ provides an eigenvector with eigenvalue $(m + p)\hbar = j\hbar$, and then
that $J_-^{j-m}\psi_j$ is an eigenvector with eigenvalue $m\hbar$. One may check the
degeneracies as for the harmonic oscillator. □


Corollary 8.3.2. If there are no degeneracies then the eigenspace on
which $J^2$ takes the eigenvalue $j(j + 1)\hbar^2$ is $2j + 1$ dimensional with a
basis of the form $\{\psi_m \in \mathcal{H} : 0 \le j - m \le 2j\}$ such that

\[ J_3\psi_m = m\hbar\psi_m, \]

\[ J_\pm\psi_m = \sqrt{(j \mp m)(j \pm m + 1)}\,\hbar\psi_{m\pm1}. \]


Proof. According to Corollary 8.2.4(iii), when $J^2\psi = j(j + 1)\hbar^2\psi$ and
$J_3\psi = m\hbar\psi$, we have

\[ \|J_-\psi\|^2 = [j(j + 1) - m(m - 1)]\hbar^2\|\psi\|^2 = (j + m)(j - m + 1)\hbar^2\|\psi\|^2. \tag{8.15} \]

Let $\psi_j$ be a normalized common eigenvector for which the $J_3$ eigenvalue is
$j\hbar$. We may iteratively define

\[ \psi_{j-q} = [q(2j - q + 1)]^{-1/2}\hbar^{-1}J_-\psi_{j-q+1} \tag{8.16} \]

for $q = 1, 2, \ldots, 2j$, to obtain a sequence of normalized eigenvectors having
each of the permissible eigenvalues for $J_3$. Applying $J_+$ to this defining
relation and using Proposition 8.2.3(ii) we obtain

\begin{align*}
J_+\psi_{j-q} &= [q(2j - q + 1)]^{-1/2}\hbar^{-1}(J^2 - J_3^2 + \hbar J_3)\psi_{j-q+1} \\
 &= [q(2j - q + 1)]^{1/2}\hbar\psi_{j-q+1}, \tag{8.17}
\end{align*}

from which the stated expression for the action of $J_+$ follows. The action
of $J_-$ is, by definition, as asserted. □
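The formulae of Corollary 8.3.2 can be turned directly into matrices. The sketch below is illustrative only: it assumes units in which $\hbar = 1$, orders the basis as $\psi_j, \psi_{j-1}, \ldots, \psi_{-j}$, and uses NumPy; the names `angular_momentum_matrices` and `commutator` are hypothetical and do not come from the text.

```python
import numpy as np

HBAR = 1.0  # assumption: units with hbar = 1


def angular_momentum_matrices(j):
    """Return (J1, J2, J3) on the (2j+1)-dimensional space of Corollary 8.3.2.

    Basis order: psi_j, psi_{j-1}, ..., psi_{-j}.
    """
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)  # J3 eigenvalues j, j-1, ..., -j (in units of hbar)
    J3 = HBAR * np.diag(m).astype(complex)
    Jminus = np.zeros((dim, dim), dtype=complex)
    for col in range(dim - 1):
        # J_- psi_m = sqrt((j + m)(j - m + 1)) hbar psi_{m-1}
        Jminus[col + 1, col] = HBAR * np.sqrt((j + m[col]) * (j - m[col] + 1))
    Jplus = Jminus.conj().T  # J_+ is the adjoint of J_-
    J1 = 0.5 * (Jplus + Jminus)
    J2 = -0.5j * (Jplus - Jminus)
    return J1, J2, J3


def commutator(a, b):
    return a @ b - b @ a


if __name__ == "__main__":
    for j in (0.5, 1.0, 1.5):
        J1, J2, J3 = angular_momentum_matrices(j)
        ok = np.allclose(commutator(J1, J2), 1j * HBAR * J3)
        total = J1 @ J1 + J2 @ J2 + J3 @ J3
        casimir_ok = np.allclose(total, j * (j + 1) * HBAR**2 * np.eye(len(J3)))
        print(f"j={j}: [J1,J2]=i*hbar*J3 is {ok}, J^2=j(j+1)hbar^2 is {casimir_ok}")
```

Building $J_+$ as the adjoint of $J_-$, rather than from its own formula, gives an independent check that the two expressions in the corollary are consistent with each other.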
Spectacular demonstrations of this quantization are now available. At
temperatures below 2.19 K liquid helium becomes a ‘superfluid’ and starts
to exhibit quantum phenomena on a macroscopic scale. If a small bucket
of the fluid is set spinning, vortices can be formed, each of which carries
one quantum $\hbar$ of angular momentum.

We shall now show how Corollary 8.3.2 can be used to get explicit op- 
erators satisfying the commutation relations for angular momentum. 


Example 8.3.1. When $j = 0$ then $2j + 1 = 1$ and the space is spanned
by a single vector $\psi_0$ such that
\[ J_+\psi_0 = 0, \qquad J_3\psi_0 = 0, \qquad J_-\psi_0 = 0. \tag{8.18} \]


It is easy to verify that these operators do satisfy the angular momentum 
commutation relations, but for obvious reasons this is sometimes known as 
the trivial representation of those relations. 


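Example 8.3.2. The next permissible value is $j = \frac12$. For $j = \frac12$ the
space is spanned by two vectors $\psi_\pm = \psi_{\pm\frac12}$ such that

\begin{align*}
J_3\psi_\pm &= \pm\tfrac12\hbar\psi_\pm \tag{8.19} \\
J_\pm\psi_\pm &= 0 \tag{8.20} \\
J_+\psi_- &= \hbar\psi_+, \qquad J_-\psi_+ = \hbar\psi_-. \tag{8.21}
\end{align*}

With respect to the basis $\psi_+$, $\psi_-$ we have the matrices

\[ J_3 = \tfrac12\hbar\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
   J_+ = \hbar\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
   J_- = \hbar\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{8.22} \]

Using the identities $J_1 = \frac12(J_+ + J_-)$, and $J_2 = -\frac12 i(J_+ - J_-)$ we
derive

\[ J_1 = \tfrac12\hbar\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
   J_2 = \tfrac12\hbar\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \tag{8.23} \]


Definition 8.3.1. The matrix representation of $J_1$, $J_2$ and $J_3$ given
above is called the spin representation of angular momentum. The
three matrices

\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
   \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
   \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]

are called the Pauli spin matrices.

Using the Pauli spin matrices we may write the spin representation in
the form $J_k = \frac12\hbar\sigma_k$ for $k = 1, 2, 3$, and

\[ \mathbf{a}.\mathbf{J} = \tfrac12\hbar\,\mathbf{a}.\boldsymbol{\sigma} = \tfrac12\hbar\begin{pmatrix} a_3 & a_1 - ia_2 \\ a_1 + ia_2 & -a_3 \end{pmatrix}. \tag{8.24} \]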


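Example 8.3.3. When $j = 1$ we have a space spanned by $2j + 1 = 3$
vectors, $\psi_+ = \psi_{+1}$, $\psi_0$, and $\psi_- = \psi_{-1}$. Each of these is an eigenvector of
$J^2$ with eigenvalue $2\hbar^2$. According to our formulae

\begin{align*}
J_3\psi_+ &= \hbar\psi_+ & J_3\psi_0 &= 0 & J_3\psi_- &= -\hbar\psi_- \\
J_-\psi_+ &= \sqrt2\hbar\psi_0 & J_-\psi_0 &= \sqrt2\hbar\psi_- & J_-\psi_- &= 0 \tag{8.25} \\
J_+\psi_+ &= 0 & J_+\psi_0 &= \sqrt2\hbar\psi_+ & J_+\psi_- &= \sqrt2\hbar\psi_0.
\end{align*}

With respect to the basis $\psi_+$, $\psi_0$, $\psi_-$, the angular momentum operators are
represented by the following matrices:

\[ J_+ = \hbar\begin{pmatrix} 0 & \sqrt2 & 0 \\ 0 & 0 & \sqrt2 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
   J_- = \hbar\begin{pmatrix} 0 & 0 & 0 \\ \sqrt2 & 0 & 0 \\ 0 & \sqrt2 & 0 \end{pmatrix}, \]

\[ J_3 = \hbar\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \]

In this form one can readily check that the commutation relations for the
$J_j$ are satisfied. In fact, we shall show in the next section that this is just
a heavily disguised version of ordinary three-dimensional space.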


8.4. Orbital angular momentum and spherical harmonics 


Theorem 8.3.1 not only tells us which eigenvalues of $J^2$ and $J_3$ are possible,
but, as the examples show, its corollary also tells us how all the operators 
act on a particular basis of vectors. However, although this works well in 
an abstract inner product space, it is not always possible to realize the 
same operators on a space of wave functions. 


Proposition 8.4.1. For orbital angular momentum the parameters 
j and m in Theorem 8.3.1 must both be integers. 
Proof. To understand the distinction between orbital angular momentum
and the abstract algebraic situation that we studied in the previous
sections we need to study the action of the angular momentum operators
on wave functions in greater detail. In terms of spherical polar
coordinates $(r, \theta, \phi)$ the position vector is
$\mathbf{x} = (r\sin\theta\cos\phi, r\sin\theta\sin\phi, r\cos\theta)$.
According to the chain rule

\[ \frac{\partial}{\partial\phi} = \frac{\partial x_1}{\partial\phi}\frac{\partial}{\partial x_1} + \frac{\partial x_2}{\partial\phi}\frac{\partial}{\partial x_2} + \frac{\partial x_3}{\partial\phi}\frac{\partial}{\partial x_3} = -x_2\frac{\partial}{\partial x_1} + x_1\frac{\partial}{\partial x_2}, \tag{8.26} \]

so that

\[ \frac{\hbar}{i}\frac{\partial}{\partial\phi} = X_1P_2 - X_2P_1 = L_3. \tag{8.27} \]

Now $L_3\psi_m = m\hbar\psi_m$ can be rewritten as

\[ \frac{\partial\psi_m}{\partial\phi} = im\psi_m, \tag{8.28} \]

which integrates to

\[ \psi_m(r, \theta, \phi) = e^{im\phi}\psi_m(r, \theta, 0). \tag{8.29} \]

On the other hand we want

\[ \psi_m(r, \theta, \phi) = \psi_m(r, \theta, \phi + 2\pi) = e^{2\pi im}\psi_m(r, \theta, \phi). \tag{8.30} \]

Unless $m$ is integral this forces $\psi_m$ to vanish identically, so that the
integrality of $m$ follows immediately. Since $j$ is just the maximum value of $m$,
it must also be integral. □


We must now try to reconcile this result with the existence of the spin
representation for $j = \frac12$ which we explicitly constructed in Example 8.3.2.
We have just seen that in general a rotation through $2\pi$ multiplies $\psi_m$ by
a factor of $\exp(2\pi im)$, which is $-1$ when $j$ is an odd multiple of $\frac12$. In
the abstract theory the vectors $\psi_m$ and $-\psi_m$ are regarded as defining the
same state, and so there is nothing wrong with introducing a minus sign
when we rotate through $2\pi$. Wave functions are more restrictive, however,
since we need to assign to them a value at each point. Experiments have
made it clear that the electron and many other subatomic particles found
in nature have, in addition to any orbital angular momentum, an intrinsic
angular momentum or spin given by $j = \frac12$. It is the sum of the orbital and
spin angular momentum which is conserved, and not the orbital angular
momentum on its own. In situations where the spin matters (for example,
when an electron enters a magnetic field) we cannot simply use ordinary
wave functions. What one in fact does is to combine the spin and orbital
angular momenta by taking two-component wave functions, or spinors,
whose values lie in the two-dimensional spin representation space.

It is also true that for any integer $l$ there is a $(2l + 1)$-dimensional
subspace of wave functions on which the angular momentum operators
act in the way described in Corollary 8.3.2. This can be shown by explicit
solution of the equations

\[ L^2\psi_m = l(l + 1)\hbar^2\psi_m, \qquad L_3\psi_m = m\hbar\psi_m, \tag{8.31} \]

in polar coordinates, which amounts to the analysis of Section 4.2.
However, there is an alternative approach that identifies the angular momentum
wave functions as particular solutions of Laplace's equation. This exploits
the operator analogue of the vector identity

\[ |\mathbf{x}|^2|\mathbf{p}|^2 - (\mathbf{x}.\mathbf{p})^2 = |\mathbf{x} \times \mathbf{p}|^2. \tag{8.32} \]


Proposition 8.4.2. Let $D = X_1P_1 + X_2P_2 + X_3P_3$. Then

\[ L^2 = X^2P^2 - D(D - i\hbar). \]


Proof. We first note that, using Proposition 8.1.1, we have

\begin{align*}
L_1^2 &= L_1(X_2P_3 - X_3P_2) \\
 &= (X_2L_1 + i\hbar X_3)P_3 - (X_3L_1 - i\hbar X_2)P_2 \\
 &= X_2L_1P_3 - X_3L_1P_2 + i\hbar(X_2P_2 + X_3P_3). \tag{8.33}
\end{align*}

Adding this to the analogous equations for $L_2^2$ and $L_3^2$ we get

\[ L^2 = \epsilon_{jkl}X_jL_lP_k + 2i\hbar D. \tag{8.34} \]

The canonical commutation relations (Theorem 7.1.1) also give

\[ P_jX_k - X_jP_k = (X_kP_j - X_jP_k) - i\hbar\delta_{jk} = -\epsilon_{jkl}L_l - i\hbar\delta_{jk}. \tag{8.35} \]

Multiplying on the left by $X_j$, on the right by $P_k$, and summing over $j$ and
$k$, this gives

\[ D^2 - X^2P^2 = -\epsilon_{jkl}X_jL_lP_k - i\hbar D. \tag{8.36} \]
Adding this to the last expression for $L^2$ yields

\[ L^2 + D^2 - X^2P^2 = i\hbar D, \tag{8.37} \]

which reduces to the stated relationship. □


Corollary 8.4.3. The Laplace operator may be written as

\[ \Delta = \frac{1}{r}\frac{\partial^2}{\partial r^2}r - \frac{L^2}{\hbar^2r^2}. \]


Proof. We have $P^2 = -\hbar^2\Delta$, $X^2 = r^2$, and

\[ D = \frac{\hbar}{i}r\frac{\partial}{\partial r}, \tag{8.38} \]

so that

\[ L^2 = -\hbar^2\left[r^2\Delta - r\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r} + 1\right)\right]. \tag{8.39} \]

Rearranging, this gives

\[ \Delta = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r} + 1\right) - \frac{L^2}{\hbar^2r^2}, \tag{8.40} \]

which then simplifies to give the stated result. □


Theorem 8.4.4. Let $F$ be a fixed function of $r \in (0, \infty)$ that satisfies

\[ \int_0^\infty r^{2(l+1)}|F(r)|^2\,dr < \infty, \]

and let $\mathcal{Y}_l$ be the space of functions of the form $F(r)Y(\mathbf{x})$ with $Y$ a
polynomial that is both harmonic and also homogeneous of degree $l$.
Then $\mathcal{Y}_l$ is a $(2l + 1)$-dimensional space on which $L_1$, $L_2$, and $L_3$ act
with total angular momentum $l(l + 1)\hbar^2$.


Proof. The condition on $F$ is included just to ensure normalizability.
Euler's theorem on homogeneous functions tells us that

\[ DY = \frac{\hbar}{i}\left(x_1\frac{\partial Y}{\partial x_1} + x_2\frac{\partial Y}{\partial x_2} + x_3\frac{\partial Y}{\partial x_3}\right) = -il\hbar Y. \tag{8.41} \]

Together with $P^2Y = -\hbar^2\Delta Y = 0$, this gives

\begin{align*}
L^2Y &= X^2P^2Y - D(D - i\hbar)Y \\
 &= -(-il\hbar)(-il\hbar - i\hbar)Y = l(l + 1)\hbar^2Y. \tag{8.42}
\end{align*}

Thus $L^2$ has the correct value.

We have already seen that $L_3 = -i\hbar\,\partial/\partial\phi$, which commutes with $\Delta$
and consequently preserves the space of harmonic functions. It similarly
preserves the degree of homogeneity and so maps $\mathcal{Y}_l$ to itself. By symmetry
the same applies to $L_1$ and $L_2$.

Now a homogeneous polynomial of degree $l$ in three variables has
$(l + 1)(l + 2)/2$ coefficients and so that is the dimension of the space of such
polynomials. The two differentiations of the Laplace operator lower the
degree to $l - 2$, so that Laplace's equation imposes $l(l - 1)/2$ constraints
on the coefficients of the resulting polynomial $\Delta Y$. Thus the space $\mathcal{Y}_l$ has
dimension

\[ \frac{(l + 1)(l + 2)}{2} - \frac{l(l - 1)}{2} = \frac{4l + 2}{2} = 2l + 1. \tag{8.43} \]

(Alternatively one can show that the eigenvalue $l\hbar$ for $L_3$ is non-degenerate.
This is best done by using Corollary 8.2.4 to identify these eigenvectors with
vectors killed by $L_+$.) □


Example 8.4.1. Polynomials of degree 0 are just constant and so
automatically harmonic. They give the one-dimensional space and trivial
representation discussed in Example 8.3.1.


Example 8.4.2. Polynomials of degree 1 can be written in the form

\[ Y_{\mathbf{a}}(\mathbf{x}) = \mathbf{a}.\mathbf{x} = a_1x_1 + a_2x_2 + a_3x_3, \tag{8.44} \]

where the components $a_1$, $a_2$, and $a_3$ may be complex. These are also
automatically harmonic and provide the three-dimensional space described
in Example 8.3.3. We can readily find the basis used there by noting that

\begin{align*}
L_3Y_{\mathbf{a}} &= \frac{\hbar}{i}\left(x_1\frac{\partial}{\partial x_2} - x_2\frac{\partial}{\partial x_1}\right)(a_1x_1 + a_2x_2 + a_3x_3) \\
 &= \frac{\hbar}{i}(x_1a_2 - x_2a_1). \tag{8.45}
\end{align*}
Thus $L_3(\mathbf{a}.\mathbf{x}) = 0$ if and only if $a_1 = a_2 = 0$. The polynomial $Y_0(\mathbf{x}) = x_3$
is therefore an eigenvector of $L_3$ with value 0. Similarly

\[ Y_\pm(\mathbf{x}) = \mp\frac{1}{\sqrt2}(x_1 \pm ix_2) \tag{8.46} \]

satisfies

\[ L_3Y_\pm = \pm\hbar Y_\pm. \tag{8.47} \]


Example 8.4.3. There are six independent polynomials of degree 2, the
quadratics $x_1^2$, $x_2^2$, $x_3^2$, $x_2x_3$, $x_3x_1$, and $x_1x_2$, and this time Laplace's
equation imposes a genuine constraint, since, for example, $x_1^2$ is certainly not
harmonic. Each of the quadratics $x_j^2 - x_k^2$ and $x_jx_k$ with $j < k$ satisfies
Laplace's equation, but they are related by the identity

\[ x_1^2 - x_3^2 = (x_1^2 - x_2^2) + (x_2^2 - x_3^2). \tag{8.48} \]

There are therefore just five independent quadratics on which the
orbital angular momentum operators act, and forming a basis for the
five-dimensional space when $l = 2$.
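The dimension count behind Theorem 8.4.4 and the three examples can be checked by brute force. The following sketch is not the book's method: it represents a homogeneous polynomial of degree $l$ by its coefficients on the monomials $x_1^ax_2^bx_3^c$ with $a + b + c = l$, writes the Laplacian as a matrix between the monomial bases of degrees $l$ and $l - 2$, and takes the dimension of its kernel. NumPy and all function names are assumptions made for the illustration.

```python
import numpy as np


def monomials(degree):
    """Exponent triples (a, b, c) with a + b + c = degree (empty if degree < 0)."""
    if degree < 0:
        return []
    return [(a, b, degree - a - b)
            for a in range(degree + 1)
            for b in range(degree - a + 1)]


def laplacian_matrix(degree):
    """Matrix of the Laplacian from degree-l to degree-(l-2) homogeneous polynomials."""
    src = monomials(degree)
    dst = monomials(degree - 2)
    index = {mono: i for i, mono in enumerate(dst)}
    mat = np.zeros((len(dst), len(src)))
    for col, mono in enumerate(src):
        for axis in range(3):
            e = mono[axis]
            if e >= 2:
                # d^2/dx^2 of x^e is e(e-1) x^(e-2)
                lowered = list(mono)
                lowered[axis] -= 2
                mat[index[tuple(lowered)], col] += e * (e - 1)
    return mat


def harmonic_dimension(degree):
    """Dimension of the space of harmonic homogeneous polynomials of this degree."""
    mat = laplacian_matrix(degree)
    rank = int(np.linalg.matrix_rank(mat)) if mat.size else 0
    return len(monomials(degree)) - rank


if __name__ == "__main__":
    for l in range(5):
        print(l, len(monomials(l)), harmonic_dimension(l))
```

For $l = 2$ the matrix has a single row with the entry 2 under each of the three squares, so its rank is 1 and the kernel has dimension $6 - 1 = 5$, in line with Example 8.4.3.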


8.5. Centre of mass coordinates 


We can now return to central force problems and furnish a more complete
analysis than was possible in Chapter 4. We start by considering the general
problem of two particles with masses $m_1$ and $m_2$ interacting through a force
that depends only on the distance between them. More precisely, writing $\mathbf{r}_1$
and $\mathbf{r}_2$ for the positions of the two particles, we assume that the interaction
is described by a potential $V(|\mathbf{r}_1 - \mathbf{r}_2|)$. The wave function $\Psi$ for this system
is a function of both $\mathbf{r}_1$ and $\mathbf{r}_2$, and it satisfies the Schrödinger equation

\[ -\frac{\hbar^2}{2m_1}\Delta_1\Psi - \frac{\hbar^2}{2m_2}\Delta_2\Psi + V(|\mathbf{r}_1 - \mathbf{r}_2|)\Psi = E_T\Psi, \tag{8.49} \]

where $\Delta_j$ is the Laplacian for the $j$-th particle and $E_T$ is the total energy.

To handle this system we adopt a trick from classical mechanics and
change to centre of mass coordinates: that is, we introduce the total mass
$M = m_1 + m_2$, the centre of mass position vector $\mathbf{R} = (m_1\mathbf{r}_1 + m_2\mathbf{r}_2)/M$,
the difference vector $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$, and the reduced mass

\[ m = \left(\frac{1}{m_1} + \frac{1}{m_2}\right)^{-1}. \tag{8.50} \]
It is then a routine exercise in partial differentiation (see Exercise 8.1) to
check that

\[ \frac{1}{m_1}\Delta_1 + \frac{1}{m_2}\Delta_2 = \frac{1}{M}\Delta_R + \frac{1}{m}\Delta, \tag{8.51} \]

where $\Delta_R$ and $\Delta$ denote the Laplacian operators for the coordinates of $\mathbf{R}$
and of $\mathbf{r}$, respectively, so that Schrödinger's equation can be written as

\[ -\frac{\hbar^2}{2M}\Delta_R\Psi - \frac{\hbar^2}{2m}\Delta\Psi + V(r)\Psi = E_T\Psi. \tag{8.52} \]


Proposition 8.5.1. The equation

\[ -\frac{\hbar^2}{2M}\Delta_R\Psi - \frac{\hbar^2}{2m}\Delta\Psi + V(r)\Psi = E_T\Psi \]

has separable solutions of the form $\Psi = \phi(\mathbf{R})\psi(\mathbf{r})$, where $\phi$ and $\psi$
satisfy the equations

\[ -\frac{\hbar^2}{2M}\Delta_R\phi = E_R\phi \]

and

\[ -\frac{\hbar^2}{2m}\Delta\psi + V(r)\psi = E\psi, \]

with $E_R + E = E_T$.


Proof. This straightforward substitution is left for the reader. □


The first equation simply represents the force-free motion of the centre
of mass, whilst the second gives the motion of a particle of mass $m$ in a
potential $V$ with fixed centre. This provides retrospective justification for
the fixed nucleus assumption imposed in Section 4.1. The only change is
that $m$ should be regarded as the reduced mass of the system rather than
the electron mass. However, the mass, $m_2$, of a hydrogen nucleus is around
1836 times the electron mass, $m_1$, so that

\begin{align*}
m &= \left(\frac{1}{m_1} + \frac{1}{1836m_1}\right)^{-1} \\
 &= \left(\frac{1837}{1836m_1}\right)^{-1} \\
 &= \frac{1836}{1837}m_1, \tag{8.53}
\end{align*}

which differs from $m_1$ by only 0.05%.
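The size of the reduced-mass correction is easy to tabulate. The sketch below simply evaluates (8.50), assuming masses measured in units of the electron mass and the rounded proton-to-electron ratio 1836 quoted in the text; the function name is illustrative.

```python
def reduced_mass(m1: float, m2: float) -> float:
    """Reduced mass (1/m1 + 1/m2)^(-1), as in (8.50)."""
    return 1.0 / (1.0 / m1 + 1.0 / m2)


if __name__ == "__main__":
    electron = 1.0   # unit of mass: the electron mass
    proton = 1836.0  # rounded ratio quoted in the text

    m_hydrogen = reduced_mass(electron, proton)
    shift = (electron - m_hydrogen) / electron
    print(f"hydrogen:    m = {m_hydrogen:.6f} m1, relative shift = {100 * shift:.3f}%")

    # Equal masses: the reduced mass is half of either mass.
    m_equal = reduced_mass(electron, electron)
    print(f"equal masses: m = {m_equal:.6f} m1")
```

The relative shift for hydrogen is exactly $1/1837$, which is the figure of about 0.05% in the text, whereas for two equal masses the reduced mass is half the particle mass, so the correction is no longer small.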


Remark 8.5.1. There are other situations in which the masses of the two 
particles are comparable and the centre of mass system makes a significant 
difference. For example, it is possible to have an electron orbit not a proton 
as in the hydrogen atom but a positron, a particle of the same mass as an 
electron but oppositely charged, forming what is called positronium. There 
are also bound atom-like states of oppositely charged quarks, the most basic 
known constituents of nuclear matter. 


8.6. Angular momentum states 


Once we have reduced to centre of mass coordinates to obtain the equation

\[ -\frac{\hbar^2}{2m}\Delta\psi + V(r)\psi = E\psi \tag{8.54} \]

we can exploit the connection between the Laplacian and the angular
momentum. Corollary 8.4.3 tells us that Schrödinger's equation may be
rewritten as

\[ -\frac{\hbar^2}{2m}\frac{1}{r}\frac{\partial^2}{\partial r^2}(r\psi) + \frac{1}{2mr^2}L^2\psi + V(r)\psi = E\psi. \tag{8.55} \]

By Corollary 8.2.2 $L^2$ and $L_3$ commute with $P^2$ and with $r^2$ and so
commute with the Hamiltonian operator,

\[ H = \frac{P^2}{2m} + V(r). \tag{8.56} \]

We may therefore choose a wave function $\psi$ that is simultaneously an
eigenfunction of $H$, $L^2$, and $L_3$. If we take $\psi$ satisfying $L^2\psi = l(l + 1)\hbar^2\psi$ then
the Schrödinger equation reduces to

\[ -\frac{\hbar^2}{2m}\left(\frac{1}{r}\frac{\partial^2}{\partial r^2}(r\psi) - \frac{l(l + 1)}{r^2}\psi\right) + V(r)\psi = E\psi, \tag{8.57} \]

which is precisely the radial equation (4.9) for separable solutions,
providing retrospective justification for this. The analysis of the hydrogen atom
then proceeds exactly as in Chapter 4.


8.7. An algebraic solution of the hydrogen atom 


Pauli’s solution of the hydrogen atom problem in quantum mechanics
exploited a classical technique due to Hamilton, and, in component form,
Lagrange, who had shown that the vector, $\mathbf{A} = (mK)^{-1}(\mathbf{r} \times \mathbf{p}) \times \mathbf{p} + \mathbf{r}/r$
is constant during classical motion under the potential $-K/r$. This vector,
which points towards the centre of attraction from the point of closest
approach, and whose magnitude is just the eccentricity of the orbit,
determines the orbit. (It is sometimes called the Runge-Lenz vector, after two
rediscoverers.) In quantum mechanics one can similarly define

\[ \mathbf{A} = \frac{1}{2mK}(\mathbf{L} \times \mathbf{P} - \mathbf{P} \times \mathbf{L}) + \frac{1}{r}\mathbf{X}, \tag{8.58} \]

which commutes with the Hamiltonian, satisfies

\[ \mathbf{A}.\mathbf{L} = 0, \tag{8.59} \]

and, by use of Proposition 8.1.1 and Corollary 8.2.2, is seen to satisfy

\[ [L_j, A_k] = i\hbar\epsilon_{jkl}A_l. \tag{8.60} \]

After a good deal of strenuous algebra, working from the commutation
relations, it is also possible to show that

\[ [A_j, A_k] = i\hbar(-2H/mK^2)\epsilon_{jkl}L_l \tag{8.61} \]

and

\[ A^2 = (2H/mK^2)(L^2 + \hbar^2) + 1. \tag{8.62} \]

For solutions of Schrödinger's equation $H\psi = E\psi$, one introduces the new
sets of operators

\[ \mathbf{J}^\pm = \tfrac12\left[\mathbf{L} \pm (-2E/mK^2)^{-1/2}\mathbf{A}\right] \tag{8.63} \]

which, using equations (8.60) and (8.61), are easily seen to commute with each
other, and to satisfy

\[ [J_j^\pm, J_k^\pm] = i\hbar\epsilon_{jkl}J_l^\pm. \tag{8.64} \]

They therefore provide two commuting sets of angular momentum
operators, and we shall be able to choose wave functions satisfying

\[ (J^\pm)^2\psi = j_\pm(j_\pm + 1)\hbar^2\psi. \tag{8.65} \]

Taking into account equations (8.59) and (8.62), we also have $(J^+)^2 = (J^-)^2$
which means that $j_+ = j_-$, and

\[ 4(J^\pm)^2 + \hbar^2 = -mK^2/2E, \tag{8.66} \]

which gives

\[ [4j_+(j_+ + 1) + 1]\hbar^2 = -mK^2/2E. \tag{8.67} \]

From this we immediately deduce that

\[ E = -\frac{mK^2}{2\hbar^2(2j_+ + 1)^2}, \tag{8.68} \]

in agreement with Proposition 4.3.1.
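The last step is the arithmetic identity $4j(j + 1) + 1 = (2j + 1)^2$ applied to (8.67). The sketch below tabulates the resulting levels with exact fractions, in units where $mK^2/\hbar^2 = 1$ (an assumption made only to keep the numbers simple). The degeneracy column counts the $(2j_+ + 1)(2j_- + 1)$ joint eigenvalue pairs for the two commuting sets of operators; that each pair occurs exactly once is assumed here rather than proved in the text. All names are illustrative.

```python
from fractions import Fraction


def energy(two_j: int) -> Fraction:
    """E from (8.68) in units of m K^2 / hbar^2, for j_+ = two_j / 2."""
    n = two_j + 1  # n = 2 j_+ + 1
    return Fraction(-1, 2 * n * n)


def pair_count(two_j: int) -> int:
    """Number of (m_+, m_-) pairs when j_+ = j_- = two_j / 2."""
    n = two_j + 1
    return n * n


if __name__ == "__main__":
    for two_j in range(4):
        j = Fraction(two_j, 2)
        # Check 4 j (j + 1) + 1 == (2 j + 1)^2, the step from (8.67) to (8.68).
        assert 4 * j * (j + 1) + 1 == (2 * j + 1) ** 2
        print(f"j+ = {j}:  n = {two_j + 1},  E = {energy(two_j)},  pairs = {pair_count(two_j)}")
```

Writing $n = 2j_+ + 1$ shows that half-integral values of $j_+$ are needed here: they give the even values of $n$.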


8.8. The spin representation


Although orbital angular momentum has to be integral we have
seen that there are operators that obey the angular momentum
commutation relations with half integral $j$, for example the Pauli spin matrices for
$j = \frac12$. The spin matrices have many fascinating properties, some of which
are summarized in the following result.


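Theorem 8.8.1. Let $\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$ denote the Pauli spin matrices
and let

\[ \boldsymbol{\sigma}.\mathbf{a} = \mathbf{a}.\boldsymbol{\sigma} = a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3. \]

Then
(i) for any $\mathbf{a} \in \mathbf{R}^3$ the matrix $\mathbf{a}.\boldsymbol{\sigma}$ is a self-adjoint matrix with trace
0 and every tracefree self-adjoint matrix is of this form;
(ii) for all $\mathbf{a}, \mathbf{b} \in \mathbf{R}^3$ we have

\[ (\mathbf{a}.\boldsymbol{\sigma})(\mathbf{b}.\boldsymbol{\sigma}) = (\mathbf{a}.\mathbf{b})1 + i(\mathbf{a} \times \mathbf{b}).\boldsymbol{\sigma}. \]


Proof. (i) Equation (8.24) gives the expression

\[ \mathbf{a}.\boldsymbol{\sigma} = \begin{pmatrix} a_3 & a_1 - ia_2 \\ a_1 + ia_2 & -a_3 \end{pmatrix} \tag{8.69} \]

which is obviously self-adjoint (hermitian) and has trace 0. Conversely, if
$X$ is any $2 \times 2$ self-adjoint matrix then it is easy to check that

\[ X = \tfrac12\left[\mathrm{tr}(X)1 + \mathrm{tr}(X\sigma_1)\sigma_1 + \mathrm{tr}(X\sigma_2)\sigma_2 + \mathrm{tr}(X\sigma_3)\sigma_3\right]. \tag{8.70} \]

So if $\mathrm{tr}(X) = 0$ then we have the required form.

(ii) For the second part, it is easy to calculate from Definition 8.3.1 that the
Pauli spin matrices satisfy

\[ \sigma_j^2 = 1, \tag{8.71} \]
\[ \sigma_1\sigma_2 = i\sigma_3, \qquad \sigma_2\sigma_3 = i\sigma_1, \qquad \sigma_3\sigma_1 = i\sigma_2, \tag{8.72} \]

and similarly

\[ \sigma_2\sigma_1 = -i\sigma_3, \qquad \sigma_3\sigma_2 = -i\sigma_1, \qquad \sigma_1\sigma_3 = -i\sigma_2. \tag{8.73} \]

From these relations part (ii) is easily checked. (It may also be checked by
a direct calculation of the left-hand side, see Exercise 8.11.) □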


Equations (8.72), (8.73) immediately give what are sometimes called the 
anticommutation relations for spin: 


\[ \sigma_j\sigma_k + \sigma_k\sigma_j = 2\delta_{jk}. \tag{8.74} \]


These can be combined with the commutation relations, in the following 
corollary: 


Corollary 8.8.2.

\[ [a.\sigma, b.\sigma] = 2i(a \times b).\sigma, \]

\[ (a.\sigma)(b.\sigma) + (b.\sigma)(a.\sigma) = 2(a.b). \]


Proof. These can be derived by adding and subtracting the formulae for
$(a.\sigma)(b.\sigma)$ and $(b.\sigma)(a.\sigma)$ given in the preceding theorem. □
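
The identities of Theorem 8.8.1(ii) and Corollary 8.8.2 can also be checked numerically. The following is a minimal sketch in plain Python; the helper names (matmul, combine, dot_sigma, check_theorem, check_corollary) are hypothetical and not part of the text, and the matrices are stored as nested tuples of complex numbers.

```python
# Pauli matrices as nested tuples (rows, then columns).
SIGMA = (
    ((0, 1), (1, 0)),       # sigma_1
    ((0, -1j), (1j, 0)),    # sigma_2
    ((1, 0), (0, -1)),      # sigma_3
)
IDENTITY = ((1, 0), (0, 1))


def matmul(p, q):
    """Product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def combine(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k] of 2 x 2 matrices."""
    return tuple(
        tuple(sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2))
        for i in range(2)
    )


def dot_sigma(a):
    """The matrix a.sigma = a1*sigma_1 + a2*sigma_2 + a3*sigma_3."""
    return combine(a, SIGMA)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol for i in range(2) for j in range(2))


def check_theorem(a, b):
    """True when (a.sigma)(b.sigma) = (a.b)1 + i(a x b).sigma numerically."""
    lhs = matmul(dot_sigma(a), dot_sigma(b))
    rhs = combine((dot(a, b), 1j), (IDENTITY, dot_sigma(cross(a, b))))
    return close(lhs, rhs)


def check_corollary(a, b):
    """True when the commutator and anticommutator formulae both hold."""
    ab = matmul(dot_sigma(a), dot_sigma(b))
    ba = matmul(dot_sigma(b), dot_sigma(a))
    commutator = combine((1, -1), (ab, ba))
    anticommutator = combine((1, 1), (ab, ba))
    return (close(commutator, combine((2j,), (dot_sigma(cross(a, b)),)))
            and close(anticommutator, combine((2 * dot(a, b),), (IDENTITY,))))
```

Calling check_theorem and check_corollary with any two real 3-vectors should return True up to rounding; such a check does not of course replace the proof.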


8.9. Historical notes 


The quantization of angular momentum was first suggested by J.W. Nichol- 
son in 1912 in an early attempt to apply quantum theory to the atom. Un- 
fortunately he worked with J.J. Thomson’s ‘plum pudding’ model of the 
atom as a ball of positive charge in which the electrons were embedded, 
which Rutherford’s experimental results were just showing to be unten- 
able. However, a year later Niels Bohr combined the idea of quantizing 
angular momentum with Rutherford’s picture of the atom as a miniature 
solar system to provide a highly successful theory of atomic spectra. 

The Pauli spin matrices had in fact appeared in mathematics long before
Pauli’s rediscovery of them in the context of quantum theory. In view 
of their abstract derivation it may seem somewhat surprising that they 
have turned out to be useful in nature. However, it then transpired that 
they could be used to describe the intrinsic spin of the electron. This is 
the context in which they were investigated by Pauli, who was trying to 
understand certain puzzling features of the spectra of atoms. There is a
certain irony in the fact that, although Pauli introduced the idea of a
‘non-classical two-valuedness’, he initially dismissed the suggestions of Kronig,
Goudsmit and Uhlenbeck that this had anything to do with electron spin. 
Later Stern and Gerlach found direct experimental confirmation of the 
idea when they showed that after passing through a highly inhomogeneous 
magnetic field a beam of electrons splits into two beams corresponding to 
the two possible spin states of the electrons. 


Exercises 


8.1 Prove equation (8.51) and Proposition 8.5.1. 


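8.2 By comparing the formula of Corollary 8.4.3 with the expression for
$\Delta$ in spherical polar coordinates $(r, \theta, \phi)$, or otherwise, show that $L^2$
can be written as

\[ L^2\psi = -\hbar^2\left[ \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\psi}{\partial\phi^2} \right]. \]

Hence, or otherwise, show that there exist separable solutions of the
equations

\[ L^2\psi = l(l+1)\hbar^2\psi, \qquad L_3\psi = m\hbar\psi \]

of the form
\[ \psi = e^{im\phi} P_l^m(\cos\theta). \]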


Find expressions for P? and P? in the form of power series. 


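8.3° The operators $L_1$, $L_2$, and $L_3$ satisfy the angular momentum commutation
relations and $L_\pm = L_1 \pm iL_2$ and the vectors $\psi_m$ satisfy

\[ L^2\psi_m = l(l+1)\hbar^2\psi_m \quad\text{and}\quad L_3\psi_m = m\hbar\psi_m. \]
By considering $\langle\psi_m|L_\pm^2\psi_m\rangle$, or otherwise, show that

\[ \langle\psi_m|L_1^2\psi_m\rangle = \langle\psi_m|L_2^2\psi_m\rangle \]

and find the value in terms of l and m when $\psi_m$ is normalized. Show
also that
\[ \langle\psi_m|L_1L_2\psi_m\rangle = \tfrac{1}{2}im\hbar^2. \]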


8.4° The Hamiltonian for a particle moving in a magnetic field is given by

where pz is a positive constant, and $L_3$ is the third component of the
orbital angular momentum operator. By considering a simultaneous
eigenvector $\psi_{E,l,m}$ of $H$, $L^2$, and $L_3$ with corresponding eigenvalues
$E$, $l(l+1)\hbar^2$, and $m\hbar$, respectively, prove that there exists a
non-negative integer N such that


1 m : 
SoM eb 
ge G +N4 i) ; 
What are the possible values of m and 1? 


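8.5° The components of the orbital angular momentum operator are given
in terms of the spherical polar angles $\theta$ and $\phi$ by

\[ L_1 = i\hbar\left(\sin\phi\,\frac{\partial}{\partial\theta} + \cos\phi\cot\theta\,\frac{\partial}{\partial\phi}\right), \]
\[ L_2 = i\hbar\left(-\cos\phi\,\frac{\partial}{\partial\theta} + \sin\phi\cot\theta\,\frac{\partial}{\partial\phi}\right), \]
\[ L_3 = -i\hbar\,\frac{\partial}{\partial\phi}. \]
Show that
\[ L_\pm = \hbar e^{\pm i\phi}\left(\pm\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\phi}\right) \]

and determine the simultaneous eigenfunctions of $L^2$ and $L_3$ in terms
of $\theta$ and $\phi$ when the eigenvalue of $L^2$ is $2\hbar^2$.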


8.6° A particle of mass m moves in the spherically symmetric potential 


Show that the energy levels are given by 


é need 1 1\? 7 
= = co 2 


where l is the orbital angular momentum quantum number and n =
3,. 
ae Se Vzpeces 


-2 


8.7° Show that the components of the orbital angular momentum operator
$L = X \times P$ and the components of $X$ satisfy

\[ L.X = 0 = X.L \]
and derive the commutation relation
\[ [L^2, [L^2, X]] = 2\hbar^2(L^2X + XL^2). \]
Deduce that for all $\phi$ and $\psi$
\[ \langle L^4\phi|X\psi\rangle - 2\langle L^2\phi|XL^2\psi\rangle + \langle\phi|XL^4\psi\rangle = 2\hbar^2\left(\langle L^2\phi|X\psi\rangle + \langle\phi|XL^2\psi\rangle\right). \]

A system has eigenstates $\psi_{lm}$ such that

\[ L^2\psi_{lm} = l(l+1)\hbar^2\psi_{lm}, \qquad L_3\psi_{lm} = m\hbar\psi_{lm}. \]
Show that the matrix element $\langle\psi_{l'm'}|X\psi_{lm}\rangle$ vanishes unless

\[ [(l - l')^2 - 1](l + l')(l + l' + 2) = 0. \]


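8.8° The Hamiltonian for a rigid body rotating freely about the origin is
\[ H = \frac{1}{2}\left(\frac{L_1^2}{A_1} + \frac{L_2^2}{A_2} + \frac{L_3^2}{A_3}\right), \]
where $A_1$, $A_2$, and $A_3$ are the principal moments of inertia. If the
orthonormal vectors $\psi_{lm}$ satisfy

\[ L^2\psi_{lm} = l(l+1)\hbar^2\psi_{lm}, \qquad L_3\psi_{lm} = m\hbar\psi_{lm}, \]

show that $\langle\psi_{ln}|H\psi_{lm}\rangle$ is given by

\[ \frac{\hbar^2}{4}\left(\frac{1}{A_1} + \frac{1}{A_2}\right)[l(l+1) - m^2] + \frac{\hbar^2m^2}{2A_3} \]

if n = m, by

\[ \frac{\hbar^2}{8}\left(\frac{1}{A_1} - \frac{1}{A_2}\right)\left\{[l(l+1) - m(m \pm 1)][l(l+1) - (m \pm 1)(m \pm 2)]\right\}^{1/2} \]

if n = m ± 2, and that all the other matrix elements of H vanish.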
Deduce that the energy levels for the rigid rotator for which 1=2 are 


p 2 1 wf 1 1 
ee () " (e+): 
Tata) Fata) Tats 


Find a general formula for the energy levels when $A_1 = A_2$.

8.9° The Hamiltonian for a kinematically symmetric rigid rotator is given
by
\[ H = \tfrac{1}{2}A(J_1^2 + J_2^2) + \tfrac{1}{2}CJ_3^2, \]

where the self-adjoint operators $J_1$, $J_2$, and $J_3$ satisfy the angular
momentum commutation relations. Prove that the eigenvalues of the
Hamiltonian are

\[ \tfrac{1}{2}\left[Aj(j+1) + (C - A)m^2\right]\hbar^2 \]

with $m = -j, -j+1, \ldots, j$ and $j = 0, \tfrac{1}{2}, 1, \ldots$.


8.10° A charged particle moving in a uniform magnetic field is confined to
the plane $x_3 = 0$ at right angles to the field. The Hamiltonian is given
by

\[ H = \frac{1}{2\mu}\left[(P_1^2 + P_2^2) + \beta^2(X_1^2 + X_2^2) + 2\beta(X_1P_2 - X_2P_1)\right], \]

where $P_1$ and $P_2$ are the momenta conjugate to the coordinates $X_1$
and $X_2$ (respectively), $\mu$ is the mass, and $\beta$ a positive constant. It can
be shown, and you may assume, that the component $L_3 = X_1P_2 - X_2P_1$
of the angular momentum commutes with H. By considering
solutions of the time-independent Schrödinger equation of the form
$\exp(im\theta)\exp(-\beta r^2/2\hbar)f(r)$ where r and $\theta$ are the usual plane polar
coordinates in the $x_1x_2$-plane, or otherwise, show that the energy
levels are $\hbar\beta(2N + 1)/\mu$, where N = 0, 1, 2, .... In establishing that
the energy levels are of the above form, where, if anywhere, did you
use the fact that $L_3$ and H commute?

[You may assume that

\[ \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} = \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}. \]
]


8.11 Prove the result of 8.8.1(ii), which asserts that
\[ (a.\sigma)(b.\sigma) = (a.b)1 + i(a \times b).\sigma. \]

Let $a.\sigma = a_1\sigma_1 + a_2\sigma_2 + a_3\sigma_3$, where $\sigma_1$, $\sigma_2$, and $\sigma_3$ are the Pauli
spin matrices. Show that

\[ e^{ia.\sigma} = \cos|a| + i\,a.\sigma\,\frac{\sin|a|}{|a|}. \]


8.12° The spin operator S is self-adjoint and satisfies the commutation
relations

\[ [S_1, S_2] = i\hbar S_3, \quad [S_2, S_3] = i\hbar S_1, \quad [S_3, S_1] = i\hbar S_2, \]

and $S^2 = \tfrac{3}{4}\hbar^2$. Show that the eigenvalues of $S_3$ are $\pm\tfrac{1}{2}\hbar$.
Assuming that the eigenvalues of $S_3$ are non-degenerate show that
by a suitable choice of phase

\[ A\phi_+ = \hbar\phi_- \quad\text{and}\quad A^*\phi_- = \hbar\phi_+, \]

where $A = S_1 - iS_2$ and $\phi_\pm$ are the eigenvectors of $S_3$ corresponding
to the eigenvalues $\pm\tfrac{1}{2}\hbar$.

Two spin $\tfrac{1}{2}$ particles with spin operators S(1) and S(2) are coupled
so that the Hamiltonian is

\[ kS(1).S(2). \]
Show that

\[ S(1).S(2) = \frac{A^*(1)A(2) + A(1)A^*(2)}{2} + S_3(1)S_3(2). \]

Hence, or otherwise, obtain the energy eigenvalues.


8.13° Suppose that $K_1$, $K_2$, and $K_3$ satisfy the commutation relations

\[ [K_1, K_2] = i\hbar(\alpha^2K_3 - \beta), \quad [K_2, K_3] = i\hbar K_1, \quad [K_3, K_1] = i\hbar K_2. \]
Show that $J_1 = \alpha^{-1}K_1$, $J_2 = \alpha^{-1}K_2$, $J_3 = K_3 - \beta/\alpha^2$ satisfy the
relations for angular momentum. Hence, or otherwise, show that
there are solutions in which $K_1^2 + K_2^2 - 2\beta K_3 + \alpha^2K_3^2 + (\beta/\alpha)^2$ takes
the value $j(j+1)\hbar^2\alpha^2$ and $K_3$ has the eigenvalues $m\hbar + \beta/\alpha^2$ with m in
the same range as before. Show that when $\beta = 1$, $\alpha = [(J + \tfrac{1}{2})\hbar]^{-1/2}$
and j = J, the eigenvalues of $K_3$ are of the form $(n + \tfrac{1}{2})\hbar$ with
$0 \le n \le 2J$, whilst the value of $K_1^2 + K_2^2 - 2K_3 + [(J + \tfrac{1}{2})\hbar]^{-1}K_3^2$
is $-\hbar/(4J + 2)$. What happens to the commutation relations and to
these values in the limit as $J \to \infty$?


8.14 Let $M_+$, $M_-$, and H be operators satisfying the commutation
relations

\[ [H, M_\pm] = \pm M_\pm, \qquad [M_+, M_-] = f(H), \]

where f is a function that can be expanded in a convergent power
series. Show that $M_\pm$ maps an eigenvector in $\ker(H - \mu)$ to
$\ker(H - \mu \mp 1)$.

Show (without worrying about convergence questions) that for
any function g that can be expanded in a power series $M_\pm g(H) =
g(H \mp 1)M_\pm$. Hence, or otherwise, show that $M_+M_- + M_-M_+ + g(H)$
commutes with $M_+$, $M_-$, and H provided that

\[ [f(H + 1) + f(H)] + [g(H + 1) - g(H)] = 0. \]

Show that the following choices are possible:
(i) $f(H) = \alpha H + \beta 1$,
$g(H) = \alpha H^2 + \beta H$;
(ii) $f(H) = \sinh(2\beta H)/\sinh(2\beta)$,
$g(H) = [\cosh(2\beta H) - 1]/[\cosh(2\beta) - 1]$.
By mimicking the argument for angular momentum show that in each
of these cases there are operators on a $(2l + 1)$-dimensional space
satisfying the given relations.

To what physical systems do the cases $\alpha = 0$ and $\beta = 0$ in (i)
correspond?




9* Symmetry in quantum theory 


It has been rumoured that the group pest is gradually being cut out of
quantum mechanics. This is certainly not true as far as the rotation and
Lorentz groups are concerned.

H. Weyl, Group theory and quantum mechanics, 1930 


9.1. Group representations 


Physical systems often have some kind of obvious symmetry, such as the 
rotational symmetry of the central force field about a nucleus. Mathemati- 
cally this means that the system is invariant under the action of a symmetry 
group G. (The main ideas of group theory are reviewed in Appendix A1.) 
Such a physical symmetry must also be present in quantum theory in a way 
that is compatible with the other important structures, namely the vector 
spaces and inner products. The simplest way in which this can happen 
is for G to act on the inner product space by unitary transformations, 
since these respect both the linear structure and the inner product. This 
suggests the following definition: 


Definition 9.1.1. A unitary representation of a group G on an inner
product space $\mathcal{H}$ is a homomorphism from G to the group of unitary
operators on $\mathcal{H}$.


In quantum mechanics we assume that $\mathcal{H} \neq \{0\}$. Since there should be
no risk of confusion we shall usually abbreviate the terminology and refer
simply to a representation of G. If we write U for the homomorphism then
for each x in G we have a unitary operator U(x) such that

\[ U(xy) = U(x)U(y) \tag{9.1} \]

for all x and y in G. It is worth noting that by combining the unitarity
and the group homomorphism property we have

\[ U(x)^* = U(x)^{-1} = U(x^{-1}). \tag{9.2} \]


The definition is really justified by the wealth of examples of representations
appearing in quantum theory. We shall give a few of them here and
some more in the exercises at the end of the chapter.

Example 9.1.1. First a general mathematical example: every group G
has a trivial representation on the one dimensional space $\mathbb{C}$, which maps
each group element to the identity operator, that is

\[ U(x) = 1 \tag{9.3} \]
for all x in G.


Example 9.1.2. If A is a rotation of $\mathbb{R}^3$ we can define an operator U(A)
on ordinary wave functions by

\[ (U(A)\psi)(r) = \psi(A^{-1}r). \tag{9.4} \]

Since rotations preserve volumes the integral defining the inner product
will be unchanged, so that U(A) is unitary. For any rotations A and B and
any wave function $\psi$ we also have

\[ (U(A)U(B)\psi)(r) = (U(B)\psi)(A^{-1}r) = \psi(B^{-1}A^{-1}r) = \psi((AB)^{-1}r) = (U(AB)\psi)(r), \tag{9.5} \]


which shows that U is a representation of the rotation group. 
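
The chain of equalities (9.5) can be tried out numerically. The sketch below is an illustration only: the test function psi, the particular rotations and the helper names are hypothetical choices, and rotations are represented by 3 x 3 matrices whose inverse is the transpose.

```python
import math


def rot_z(t):
    """Rotation of R^3 through angle t about the third axis."""
    c, s = math.cos(t), math.sin(t)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


def rot_x(t):
    """Rotation of R^3 through angle t about the first axis."""
    c, s = math.cos(t), math.sin(t)
    return ((1.0, 0.0, 0.0), (0.0, c, -s), (0.0, s, c))


def mat_mul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def transpose(a):
    return tuple(tuple(a[j][i] for j in range(3)) for i in range(3))


def apply(a, r):
    return tuple(sum(a[i][k] * r[k] for k in range(3)) for i in range(3))


def U(a):
    """Return the operator U(A), which sends psi to r -> psi(A^{-1} r)."""
    a_inv = transpose(a)  # the inverse of a rotation is its transpose

    def act(psi):
        return lambda r: psi(apply(a_inv, r))

    return act


def psi(r):
    """An arbitrary test wave function with no rotational symmetry."""
    x, y, z = r
    return (x + 2.0 * y * z) * math.exp(-(x * x + 2.0 * y * y + 3.0 * z * z))


A, B = rot_z(0.7), rot_x(-1.1)
r0 = (0.3, -0.4, 0.5)
lhs = U(A)(U(B)(psi))(r0)        # (U(A)U(B)psi)(r0)
rhs = U(mat_mul(A, B))(psi)(r0)  # (U(AB)psi)(r0)
```

The two numbers lhs and rhs should agree up to rounding. The inverse in (9.4) is what makes the order come out as U(A)U(B) = U(AB) rather than U(BA).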


Example 9.1.3. This argument can be extended to give a representation
for all orthogonal transformations. One particularly interesting case occurs
when one takes the two-element group consisting of ±1. This is generated
by the parity operator, P, corresponding to the orthogonal transformation
−1 (cf. Exercise 6.3). Since $P^2 = 1$, the parity operator has eigenvalues
±1. The eigenvectors satisfy

\[ \psi(-r) = (P\psi)(r) = \pm\psi(r) \tag{9.6} \]


so that they are odd or even functions, depending on the sign of the eigen- 
value. The kinetic energy of a particle, since it depends on second deriva- 
tives, is insensitive to sign changes. For a particle moving in an even 
potential V(r) = V(—r) the Hamiltonian therefore commutes with P. The 
eigenfunctions of H may therefore be chosen to be eigenfunctions of P as 
well, that is odd or even functions. 


Example 9.1.4. If a is an element of $\mathbb{R}^3$ we can similarly define

\[ (U(a)\psi)(r) = \psi(r - a) \tag{9.7} \]

to obtain a unitary representation of the group of translations.


9.2. Representations and energy levels


Crucial to what follows is the notion of an operator which relates two 
representations to one another. 


Definition 9.2.1. Let U and W be two representations of the same
group G on spaces $\mathcal{H}$ and $\mathcal{K}$ respectively. An intertwining operator for
U and W is a linear transformation, T, from $\mathcal{H}$ to $\mathcal{K}$ which satisfies

\[ TU(x) = W(x)T \]

for all x in G. If there exists an invertible intertwining operator then
U and W are said to be equivalent.


Remark 9.2.1. One also says that T intertwines U and W. Clearly the 
sum of two intertwining operators is an intertwining operator and so is a 
scalar multiple of an intertwining operator. 


Example 9.2.1. Let G be the rotation group and let U be the representation
on normalizable wave functions defined in Example 9.1.2. Let H
be the Hamiltonian operator for a particle moving under the influence of a
central force:

\[ H = -\frac{\hbar^2}{2m}\nabla^2 + V(r). \tag{9.8} \]
Intuitively we know that since it describes a central force H is rotationally
invariant, and the mathematical expression of this fact is that H is an
intertwining operator. To see this consider

\[ (U(A)H\psi)(r) = (H\psi)(A^{-1}r). \tag{9.9} \]
Introducing $s = A^{-1}r$ we may write this as

\[ (U(A)H\psi)(r) = (H\psi)(s) = -\frac{\hbar^2}{2m}(\nabla_s^2\psi)(s) + V(s)\psi(s), \tag{9.10} \]

where we have written $\nabla_s^2$ to make clear which variable we are differentiating.
However, since s is related to r by a rotation and since $\nabla^2$ is the same
in any set of Cartesian coordinates, we have $\nabla_s^2 = \nabla_r^2$. Since we clearly
have $V(s) = V(|A^{-1}r|) = V(r)$ this means that
\[ (U(A)H\psi)(r) = \left(-\frac{\hbar^2}{2m}\nabla_r^2 + V(r)\right)\psi(A^{-1}r) = (HU(A)\psi)(r). \tag{9.11} \]
Since this is true for any wave function $\psi$ we deduce that
\[ U(A)H = HU(A); \tag{9.12} \]
in other words, that H is an intertwining operator. Similarly, if V is an
even function then the Hamiltonian operator intertwines the action of the
group generated by the parity operator. □
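
The parity statement can be illustrated on a finite grid. The sketch below is not taken from the text: it assumes units with ħ = m = 1, replaces the second derivative by a standard three-point finite difference on a grid symmetric about the origin, and uses hypothetical helper names. It compares the commutator of H and P for an even potential and for one that is not even.

```python
def hamiltonian(potential, n_half=4, h=0.5):
    """Finite-difference matrix for -(1/2) d^2/dx^2 + V(x) on a symmetric grid.

    Units with hbar = m = 1; the grid points are x_k = (k - n_half) * h.
    """
    size = 2 * n_half + 1
    xs = [(k - n_half) * h for k in range(size)]
    H = [[0.0] * size for _ in range(size)]
    for k in range(size):
        H[k][k] = 1.0 / h**2 + potential(xs[k])
        if k > 0:
            H[k][k - 1] = -0.5 / h**2
        if k < size - 1:
            H[k][k + 1] = -0.5 / h**2
    return H


def parity(size):
    """Matrix of P, (P psi)(x) = psi(-x), which reverses the grid order."""
    return [[1.0 if i + j == size - 1 else 0.0 for j in range(size)]
            for i in range(size)]


def commutator_norm(A, B):
    """Largest absolute entry of AB - BA."""
    n = len(A)

    def prod(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    AB, BA = prod(A, B), prod(B, A)
    return max(abs(AB[i][j] - BA[i][j]) for i in range(n) for j in range(n))


P = parity(9)
even_case = commutator_norm(hamiltonian(lambda x: x * x), P)     # V even
odd_case = commutator_norm(hamiltonian(lambda x: x * x + x), P)  # V not even
```

For the even potential the commutator vanishes, so the discretized H intertwines the parity representation with itself; adding the odd term x to V destroys this.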


Symmetries of the quantum system often leave subspaces of $\mathcal{H}$ invariant,
and so help us to split the problem into smaller, simpler pieces.


Definition 9.2.2. Let U be a representation of G on $\mathcal{H}$. A subspace
$\mathcal{K}$ of $\mathcal{H}$ is said to be invariant under U if $U(x)\mathcal{K} \subseteq \mathcal{K}$ for all x in G.


Lemma 9.2.1. If T intertwines U and W then ker T is invariant
under U and im T is invariant under W.


Proof. If $T\psi = 0$ then $TU(x)\psi = W(x)T\psi = 0$, so that $U(x)\psi$ is in
ker T for all x in G. This means that $U(x)\ker T \subseteq \ker T$ so that ker T is
invariant under U. Similarly, for any $\psi$ in $\mathcal{H}$ we have

\[ W(x)(T\psi) = T(U(x)\psi), \tag{9.13} \]
which shows that W(x) maps the range of T to itself. □


Corollary 9.2.2. If T intertwines U with itself then every eigenspace
of T is invariant under U.

Proof. The $\lambda$-eigenspace of T is $\ker(T - \lambda 1)$. Now the intertwining condition
means that U(x)T = TU(x) for all x in G, from which we readily
deduce that

\[ (T - \lambda 1)U(x) = U(x)(T - \lambda 1). \tag{9.14} \]
This means that $T - \lambda 1$ is also an intertwining operator, from which it
follows that the $\lambda$-eigenspace is invariant under U. □


Applied to the case in which the Hamiltonian is an intertwining operator 
for some representation of a group G, we see that each of the energy levels 
must form an invariant subspace. In particular, if the potential V is an even 
function then the eigenspaces for H are invariant under the action of the 
parity group. This means that the eigenspaces for H will themselves split 
into eigenspaces for P, that is we may as well take our energy eigenstates 
to be odd or even functions. This shows up very clearly with the harmonic 
oscillator where the eigenfunctions were either odd or even. 


Definition 9.2.3. If K is a subspace of H which is invariant under 
U, then the restriction of U(x) to K is called a subrepresentation of 


U. 


This means that each energy level carries a representation of any sym- 
metry group G for which the Hamiltonian is an intertwining operator, so 
by studying the possible representations of G we can obtain information 
about the energy levels. For example, if we can show that the represen- 
tation is more than one dimensional then we know that the energy level 
must be degenerate. Usually, though, we are able to obtain more detailed 
information than that. 

Before leaving the topic we note that invariant subspaces give some 
information about the rest of the space as well as about themselves. 


Theorem 9.2.3. If $\mathcal{K}$ is an invariant subspace under the unitary
representation U then so is $\mathcal{K}^\perp$, and U is the direct sum of the two
subrepresentations obtained by restricting to the subspaces $\mathcal{K}$ and
$\mathcal{K}^\perp$.


Proof. Let $\xi$ be an element of $\mathcal{K}^\perp$, and $\psi$ an element of $\mathcal{K}$. Then we see
that
\[ \langle U(x)\xi|\psi\rangle = \langle\xi|U(x)^*\psi\rangle = \langle\xi|U(x^{-1})\psi\rangle = 0, \tag{9.15} \]
since $U(x^{-1})\psi$ is in $\mathcal{K}$ by invariance, and $\xi$ is in $\mathcal{K}^\perp$. Thus $U(x)\xi$ is in $\mathcal{K}^\perp$,
showing that $\mathcal{K}^\perp$ is also invariant under U. The fact that U is a direct sum
of its subrepresentations follows directly from the fact that $\mathcal{H} = \mathcal{K} \oplus \mathcal{K}^\perp$. □


This means that we can regard U as the direct sum of its restrictions to
$\mathcal{K}$ and to $\mathcal{K}^\perp$, in the sense of the following definition.


Definition 9.2.4. If $U_1$ and $U_2$ are unitary representations of G on
$\mathcal{H}_1$ and $\mathcal{H}_2$ respectively, then there is a representation $U = U_1 \oplus U_2$
on $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_2$ called the direct sum defined by

\[ U(x)(\psi_1 \oplus \psi_2) = U_1(x)\psi_1 \oplus U_2(x)\psi_2 \]

for x in G, $\psi_1$ in $\mathcal{H}_1$, and $\psi_2$ in $\mathcal{H}_2$.
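
Theorem 9.2.3 and the direct sum decomposition can be seen in a small example that is not in the text: the cyclic group Z_3 acting on C^3 by cyclic shifts of the components. In this sketch (with hypothetical names) the vector (1, 1, 1) spans an invariant subspace K, and the vectors whose components sum to zero form its orthogonal complement.

```python
def U(k):
    """U(k) for k in Z_3: cyclically shift the components of a vector in C^3."""
    def act(v):
        return tuple(v[(i - k) % 3] for i in range(3))
    return act


def inner(u, v):
    """Inner product on C^3, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


ones = (1, 1, 1)            # spans the invariant subspace K
xi = (1, -2 + 1j, 1 - 1j)   # components sum to zero, so xi lies in K-perp

# K is invariant: every U(k) fixes the vector `ones`.
k_invariant = all(U(k)(ones) == ones for k in range(3))
# K-perp is invariant: U(k) xi stays orthogonal to `ones`.
perp_invariant = all(inner(ones, U(k)(xi)) == 0 for k in range(3))
# Each U(k) preserves inner products, as a unitary operator must.
unitary = all(inner(U(k)(xi), U(k)(ones)) == inner(xi, ones) for k in range(3))
```

Here U splits as the trivial representation on K together with a two-dimensional subrepresentation on the complement.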


9.3. Irreducible representations 


We have just seen that a representation that leaves a subspace invariant 
can be decomposed into two smaller pieces. This fact suggests a special 
role for those spaces that cannot be split any further. 


Definition 9.3.1. A unitary representation U on a space H for which 
the only invariant subspaces are {0} and H is said to be irreducible. 


Remark 9.3.1. Any one-dimensional representation is automatically irreducible
since its only subspaces are {0} and $\mathcal{H}$. In one dimension every
linear operator is multiplication by some scalar, and the operator is unitary
if that scalar has modulus 1.


Theorem 9.3.1. Let U be a unitary representation on a finite- 
dimensional space. Then U is a direct sum of irreducible subrep- 
resentations. 


Proof. We proceed by induction on the dimension of the space $\mathcal{H}$. If
$\dim\mathcal{H} = 1$ then U is itself irreducible and the result is obvious. In higher
dimensions there is nothing to prove if U is irreducible. If U is
not irreducible then $\mathcal{H}$ contains a non-trivial invariant subspace $\mathcal{K}$, so that
$\dim\mathcal{K}$ and $\dim\mathcal{K}^\perp$ are both less than $\dim\mathcal{H}$. By the inductive hypothesis
both $\mathcal{K}$ and $\mathcal{K}^\perp$ are direct sums of irreducible subrepresentations, so that
the result follows from Theorem 9.2.3. □


Remark 9.3.2. Most of the representations of groups on wave functions 
described in Section 9.1 are infinite dimensional, so that this result does 
not apply directly; however, we have seen that individual energy levels also
carry representations and since there is usually only a finite degeneracy
these are finite dimensional.


In infinite dimensions the situation is more complicated than this the- 
orem might suggest, and some representations cannot be so decomposed 
into a direct sum of irreducibles even if one allows an infinite number of 
terms in the sum. However, for some groups such as the rotation group it 
can be shown that every representation does decompose in this way. 

The group of rotations of the plane provides a very striking example of 
this when we consider its natural representation on periodic wave functions
given by

\[ (U(\alpha)\psi)(\theta) = \psi(\theta - \alpha) \tag{9.16} \]

for $\theta$ and $\alpha$ in $[0, 2\pi]$. The function $\psi$ can be expanded uniquely into a
Fourier series

\[ \psi(\theta) = \sum_{n=-\infty}^{\infty} c_n e^{in\theta}. \tag{9.17} \]

The functions $e_n(\theta) = \exp(in\theta)$ thus form a basis for the space. Moreover,
\[ (U(\alpha)e_n)(\theta) = e^{in(\theta - \alpha)} = e^{-in\alpha}e_n(\theta), \tag{9.18} \]


so that each basis vector spans an invariant subspace. Being only one di- 
mensional these invariant subspaces are irreducible, so the Fourier series 
provides a way of decomposing the representation U into a direct sum of 
irreducibles. We can therefore regard the decomposition of a representa- 
tion into irreducibles as a generalization of Fourier analysis. Bearing in 
mind the importance of Fourier series in problem solving, this suggests 
that the decomposition of a representation into irreducibles is likely to be 
an extremely valuable tool. 
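
The eigenvalue relation (9.18) is easy to confirm numerically. The following sketch uses hypothetical helper names and simply evaluates both sides at a sample angle.

```python
import cmath


def e_n(n):
    """Fourier mode e_n(theta) = exp(i n theta)."""
    return lambda theta: cmath.exp(1j * n * theta)


def U(alpha):
    """Rotation operator of (9.16): (U(alpha) psi)(theta) = psi(theta - alpha)."""
    def act(psi):
        return lambda theta: psi(theta - alpha)
    return act


def eigen_defect(n, alpha, theta):
    """|U(alpha) e_n - exp(-i n alpha) e_n| evaluated at the angle theta."""
    lhs = U(alpha)(e_n(n))(theta)
    rhs = cmath.exp(-1j * n * alpha) * e_n(n)(theta)
    return abs(lhs - rhs)
```

For every integer n the defect should vanish up to rounding, which is the statement that each Fourier mode spans a one-dimensional invariant subspace.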


9.4. Abelian groups 


It is no coincidence that the irreducible representations of the planar rotation
group appearing in the Fourier series are one dimensional. We shall
now show that this is always the case for abelian groups.

Theorem 9.4.1. (Schur’s lemma) If T intertwines two irreducible
representations U and W then either T = 0 or T is invertible.


Proof. By Lemma 9.2.1 ker T is a U-invariant subspace, and so by the
irreducibility of U it must be either {0} or $\mathcal{H}$. If $T \neq 0$ then we must
have ker T = {0}, which means that T is one-one. Similarly im T is W-invariant,
and if $T \neq 0$ then its image must be the whole representation
space of W, so that T is onto. Thus if T is non-zero it is both one-one and
onto, and therefore invertible. □


Corollary 9.4.2. If T intertwines an irreducible representation U 


with itself then T is a multiple of the identity operator. 


Proof. The crucial point here is a theorem that for some complex number
$\lambda$ the operator $T - \lambda 1$ is not invertible. In finite dimensions this just says
that T has an eigenvalue, which is a well-known consequence of the fact that
the characteristic polynomial has at least one complex root. An analogous
theorem holds more generally although we shall not prove it.

We have already observed in the course of proving Corollary 9.2.2 that
when T is an intertwining operator so is $T - \lambda 1$. Since, by hypothesis,
$T - \lambda 1$ is not invertible, Schur’s lemma tells us that it must vanish, that is
$T = \lambda 1$. □


Corollary 9.4.3. Every irreducible representation of an abelian 


group G is one dimensional. 


Proof. Let U be a unitary representation of G. Since G is abelian we
have

\[ U(x)U(y) = U(xy) = U(yx) = U(y)U(x) \tag{9.19} \]

for all x and y in G. This means that for every x in G the operator U(x)
is an intertwining operator. If U is irreducible then the preceding corollary
tells us that for each x, U(x) is just multiplication by some complex scalar
(which will depend on x). This means that any non-zero vector $\psi$ spans
an invariant one dimensional subspace, and by irreducibility this must be
the whole of $\mathcal{H}$. □


This result opens the way to classifying all irreducible representations
of abelian groups. It is convenient to introduce the notation $\mathbb{T}$ for the
multiplicative group of complex numbers of modulus 1:

\[ \mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}. \tag{9.20} \]


Example 9.4.1. Take $G = \mathbb{Z}$. We have already noted that one-dimensional
representations are just multiplications by scalars of modulus 1. We
therefore seek a homomorphism

\[ U : \mathbb{Z} \to \mathbb{T}. \tag{9.21} \]
Let us write $U(1) = \exp(i\theta)$. By the homomorphism property we then have
\[ U(n) = U(1)^n = e^{in\theta}, \tag{9.22} \]

so that U is completely determined by the real number $\theta$. On the other
hand it is easy to see that this formula does define a representation of $\mathbb{Z}$
for any real value of $\theta$, though values of $\theta$ that differ by a multiple of
$2\pi$ give the same representation.


Example 9.4.2. Let G = SO(2), the rotation group of the plane. Writing
$\alpha$ and $\beta$ for angles of rotation, we want a $\mathbb{T}$-valued function U such
that
\[ U(\alpha)U(\beta) = U(\alpha + \beta). \tag{9.23} \]

Since the arguments of U are angles, the function must be periodic with
period $2\pi$. There are obvious solutions of the equation when
$U(\alpha) = \exp(in\alpha)$ for some integer n, and we shall now show that there are no
others, so that the irreducible representations of SO(2) are parametrized
by integers. Assuming that U is integrable we may multiply the defining
identity by $\exp(-in\beta)$ and integrate to get information about its Fourier
coefficients. When supplemented with a change of variable, this gives

\[ U(\alpha)\int_0^{2\pi} U(\beta)e^{-in\beta}\,d\beta = \int_0^{2\pi} U(\alpha + \beta)e^{-in\beta}\,d\beta = e^{in\alpha}\int_0^{2\pi} U(\phi)e^{-in\phi}\,d\phi, \tag{9.24} \]
where the last step exploits the periodicity of U to change the limits of the
integral after substituting $\phi = \alpha + \beta$. Since U is non-zero at least one of its
Fourier coefficients, the n-th say, must be non-zero, and so can be divided
out to obtain

\[ U(\alpha) = e^{in\alpha}. \tag{9.25} \]

Using the isomorphism $\alpha \mapsto \exp(i\alpha)$ between SO(2) and $\mathbb{T}$ we can deduce
that the irreducible representations of $\mathbb{T}$ all have the form

\[ U(z) = z^n. \tag{9.26} \]


Example 9.4.3. Let $G = \mathbb{R}$. Intuition suggests that the obvious homomorphisms
from $\mathbb{R}$ to $\mathbb{T}$ given by $U(x) = \exp(ikx)$ for real k are the
only ones, and we shall now prove this. For any homomorphism U from
$\mathbb{R}$ to $\mathbb{T}$ we may find a real number $\kappa$ such that $U(2\pi) = \exp(2\pi i\kappa)$. Then
$W(x) = U(x)\exp(-i\kappa x)$ is not only a homomorphism but is also periodic
of period $2\pi$. By the preceding example we know that there is an integer
n such that $W(x) = \exp(inx)$, so that

\[ U(x) = W(x)e^{i\kappa x} = e^{i(n + \kappa)x} \tag{9.27} \]

and putting $k = n + \kappa$ we obtain the expected result. (A similar argument
may be used to obtain the irreducible representations of any abelian group
G for which we know the irreducibles of both a subgroup $\Gamma$ and the quotient
$G/\Gamma$.)


Example 9.4.4. Let G be the direct product of abelian groups
$G_1 \times G_2 \times \cdots \times G_n$. Any element x in G can be written as $(x_1, x_2, \ldots, x_n)$, so
we may expand U(x) as the product

\[ U(x) = U_1(x_1)U_2(x_2)\cdots U_n(x_n) \tag{9.28} \]

where each $U_j$ is a one-dimensional representation of $G_j$. For example, the
irreducible representations of $\mathbb{R}^3$ have the form

\[ U(x) = e^{ik_1x_1}e^{ik_2x_2}e^{ik_3x_3} = e^{ik.x}. \tag{9.29} \]

This shows that plane waves can be interpreted as irreducible representations
of $\mathbb{R}^3$. Similarly, the irreducible representations of $\mathbb{T}^n$ have the
form

\[ U(z_1, z_2, \ldots, z_n) = z_1^{k_1}z_2^{k_2}\cdots z_n^{k_n}, \tag{9.30} \]

for some integers $k_1, k_2, \ldots, k_n$.
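
As an illustration of (9.29) and (9.30), the homomorphism property of these one-dimensional representations can be checked directly. The function names below are hypothetical; the group law is vector addition for $\mathbb{R}^3$ and componentwise multiplication for $\mathbb{T}^n$.

```python
import cmath


def plane_wave(k):
    """Representation of R^3 labelled by the wave vector k: U(x) = exp(i k.x)."""
    def U(x):
        return cmath.exp(1j * sum(ki * xi for ki, xi in zip(k, x)))
    return U


def torus_character(ks):
    """Representation of T^n labelled by integers ks: U(z) = z_1^k_1 ... z_n^k_n."""
    def U(zs):
        value = 1 + 0j
        for z, k in zip(zs, ks):
            value *= z ** k
        return value
    return U


def add(x, y):
    """Group law of R^3."""
    return tuple(a + b for a, b in zip(x, y))


def multiply(zs, ws):
    """Group law of T^n."""
    return tuple(z * w for z, w in zip(zs, ws))
```

In both cases U of a product equals the product of the values of U, and every value has modulus 1, which is all that a one-dimensional unitary representation requires.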
9.5. Time evolution


We have now determined all the irreducible representations of a number of
useful groups including $\mathbb{R}$, but reducible representations are also important.
We saw in Section 6.4 that, for Hamiltonians which do not depend on time,
Schrödinger’s equation can be integrated to give

\[ \psi_t = U(t)\psi_0, \tag{9.31} \]
where $U(t) = \exp(-iHt/\hbar)$ is a unitary operator satisfying
\[ U(s + t) = U(s)U(t) \tag{9.32} \]

for all s and t in $\mathbb{R}$. These are precisely the conditions for U to be a
representation of $\mathbb{R}$.


Proposition 9.5.1. The operators U(t) define a unitary representation
of $\mathbb{R}$, that is for all $s, t \in \mathbb{R}$

\[ U(s)U(t) = U(s + t). \]


Proof. We have already seen that the operators U(t) are unitary. Assuming
(as may be proved) that exponentials behave in the usual way, we
easily verify by a formal calculation that

\[ U(s + t) = U(s)U(t), \tag{9.33} \]

as required. (When s = −t this just tells us that U(t) and U(−t) are
inverse, as needed for the unitarity.) □
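
For a two-level system the proposition can be checked explicitly. The sketch below is an illustration under stated assumptions, not part of the text: it takes the hypothetical Hamiltonian $H = \tfrac{1}{2}\hbar\omega\,n.\sigma$ with n a unit vector, and uses the closed form of Exercise 8.11 to write $U(t) = \cos(\omega t/2)1 - i\sin(\omega t/2)\,n.\sigma$.

```python
import math


def evolution(omega, n, t):
    """U(t) = exp(-iHt/hbar) for the two-level H = (hbar*omega/2) n.sigma.

    Uses exp(i a.sigma) = cos|a| 1 + i sin|a| (a.sigma)/|a| with
    a = -(omega*t/2) n; n must be a unit vector.
    """
    c, s = math.cos(omega * t / 2), math.sin(omega * t / 2)
    n1, n2, n3 = n
    # n.sigma = [[n3, n1 - i n2], [n1 + i n2, -n3]], as in (8.69)
    return (
        (c - 1j * s * n3, -1j * s * (n1 - 1j * n2)),
        (-1j * s * (n1 + 1j * n2), c + 1j * s * n3),
    )


def matmul(p, q):
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(p):
    """Adjoint (conjugate transpose) of a 2 x 2 matrix."""
    return tuple(tuple(p[j][i].conjugate() for j in range(2)) for i in range(2))


def distance(p, q):
    """Largest absolute difference between corresponding entries."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))
```

With these helpers, distance(matmul(evolution(w, n, s), evolution(w, n, t)), evolution(w, n, s + t)) should vanish up to rounding, and matmul(U, dagger(U)) should be the identity.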


Although the above arguments can all be made rigorous there is some 
advantage in reversing the argument and starting from the postulate that 
the time evolution for a conservative system is given by a unitary repre- 
sentation. (Unitarity ensures that probabilities and transition functions 
do not change with time.) The following theorem then ensures the exis- 
tence of a Hamiltonian operator H, and Schrédinger’s equation follows on 
differentiation.


Stone’s theorem 9.5.2. Let U(t) be a unitary representation of
$\mathbb{R}$ on $\mathcal{H}$, such that $\langle\psi|U(t)\psi\rangle$ is a continuous function of t for all
$\psi \in \mathcal{H}$. Then there is a unique self-adjoint operator H, called the
infinitesimal generator of U, such that

\[ U(t) = e^{-itH/\hbar} \]

for all $t \in \mathbb{R}$. The operator H is defined on the set of vectors, $\psi$, for
which $(i\hbar/t)[U(t) - 1]\psi$ converges to a limit as $t \to 0$; the limit is then
$H\psi$.
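
The limit in the theorem can be watched numerically in the simplest possible setting. The sketch below assumes a finite-dimensional space and a diagonal H with made-up eigenvalues (ENERGIES), in units with ħ = 1; none of these choices comes from the text.

```python
import cmath

HBAR = 1.0
ENERGIES = (0.5, 1.5, 2.5)  # hypothetical eigenvalues of a diagonal H


def U(t):
    """Diagonal of U(t) = exp(-itH/hbar) in the eigenbasis of H."""
    return tuple(cmath.exp(-1j * t * e / HBAR) for e in ENERGIES)


def generator_quotient(psi, t):
    """The vector (i*hbar/t) [U(t) - 1] psi appearing in Stone's theorem."""
    return tuple(1j * HBAR / t * (u - 1) * c for u, c in zip(U(t), psi))


def H(psi):
    """Action of the diagonal Hamiltonian on psi."""
    return tuple(e * c for e, c in zip(ENERGIES, psi))


def error(psi, t):
    """Largest componentwise distance between the quotient and H psi."""
    return max(abs(a - b) for a, b in zip(generator_quotient(psi, t), H(psi)))
```

As t decreases the error shrinks roughly in proportion to t, in line with the first-order expansion of the exponential.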


Stone’s theorem, whose proof lies beyond the scope of these notes, describes
all the representations of $\mathbb{R}$ and not just the irreducibles. Its scope
is far wider than might at first appear because many other groups contain
groups related to $\mathbb{R}$ as subgroups. To be more precise we need some
more terminology.

Let $\Gamma$ and G be groups with a homomorphism $\phi$ from $\Gamma$ to G. If U is a
representation of G then the composition $U \circ \phi$ is a representation of $\Gamma$ on
the same space, since the composition of homomorphisms is a homomorphism.
If $\Gamma$ is a subgroup of G, then there is an inclusion homomorphism
$\phi$ which sends an element of $\Gamma$ to itself regarded as an element of G. Then
$U \circ \phi$ is called the restriction of U to $\Gamma$. This is more simply expressed as
follows:


Definition 9.5.1. If $\Gamma$ is a subgroup of G, and U a representation of
G, then the restriction of U to $\Gamma$ takes $\gamma$ in $\Gamma$ to $U(\gamma)$.


There are many homomorphisms from $\mathbb{R}$ to the rotation group SO(3),
for given an axis in the direction of the unit vector $n \in \mathbb{R}^3$ we may define
$R_n(t)$ to be the rotation in a positive sense through an angle t about the axis
n. It is easy to check that the map taking t to $R_n(t)$ is a homomorphism, so
given any representation U of SO(3) we obtain a representation $U_n = U \circ R_n$
of $\mathbb{R}$. By Stone’s theorem there exists a self-adjoint operator $J_n$ such that

\[ U_n(t) = \exp(-itJ_n/\hbar) \tag{9.34} \]

and which is given by

\[ J_n = i\hbar\left.\frac{dU_n(t)}{dt}\right|_{t=0}. \tag{9.35} \]

The significance of $J_n$ is easily seen by reference to Example 9.1.2. There
we have

\[ (U(R_n(t))\psi)(r) = \psi(R_n(t)^{-1}r) = \psi(R_n(-t)r), \tag{9.36} \]

so that

\[ (J_n\psi)(r) = i\hbar\left.\frac{d}{dt}\psi(R_n(-t)r)\right|_{t=0} = i\hbar\left[\left(\frac{d}{dt}R_n(-t)r\right).(\nabla\psi)(R_n(-t)r)\right]_{t=0}. \tag{9.37} \]

Now $R_n(0) = 1$ and $d(R_n(-t)r)/dt$, being the derivative of a steady rotation
about n, with unit speed, is given by the classical mechanical formula
$-n \times r$. Alternatively one can differentiate the explicit formula

\[ R_n(-t)r = (1 - \cos t)(r.n)n + \cos t\,r - \sin t\,(n \times r). \tag{9.38} \]

Thence
\[ (J_n\psi)(r) = -i\hbar(n \times r).\nabla\psi = -i\hbar\,n.(r \times \nabla)\psi, \tag{9.39} \]
which is the formula for n.J given in Section 8.1. As the notation suggested
we can therefore interpret the infinitesimal generator $J_n$ as angular
momentum.
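
The claim that the derivative of $R_n(-t)r$ at t = 0 is $-n \times r$ can be tested with a finite difference applied to the explicit formula (9.38). This is a sketch with hypothetical helper names; the central difference and the step size h are choices made here, not part of the text.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_back(n, t, r):
    """R_n(-t) r from the explicit formula (9.38); n must be a unit vector."""
    c, s = math.cos(t), math.sin(t)
    rn = dot(r, n)
    nxr = cross(n, r)
    return tuple((1 - c) * rn * n[i] + c * r[i] - s * nxr[i] for i in range(3))


def derivative_at_zero(n, r, h=1e-5):
    """Central-difference estimate of d/dt R_n(-t) r at t = 0."""
    plus, minus = rotate_back(n, h, r), rotate_back(n, -h, r)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))
```

Comparing derivative_at_zero(n, r) with the negative of cross(n, r) for a unit vector n gives agreement up to the discretization error; one can also check that rotate_back preserves the length of r and leaves n fixed.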
This is no accident. If we take the subgroup $\{ta : t \in \mathbb{R}\} \subset \mathbb{R}^3$ and
consider the representation of Example 9.1.4 we have the infinitesimal
generator

\[ (P_a\psi)(r) = i\hbar\left.\frac{d}{dt}(U(ta)\psi)(r)\right|_{t=0} = i\hbar\left.\frac{d}{dt}\psi(r - ta)\right|_{t=0} = i\hbar(-a.(\nabla\psi)(r)), \tag{9.40} \]

so that $P_a$ is just the momentum in the a direction.


Definition 9.5.2. The infinitesimal generator of a representation 


arising in this way is called a generalized momentum. 


We have seen above that both the linear momentum and the angular momentum are
special cases of this definition. In the case of the rotation group one can
show directly by choosing subgroups of rotations about different axes that

$$[J_m, J_n] = i\hbar J_{m \times n}, \tag{9.41}$$

so that the commutation relations for the angular momentum are a consequence
of the fact that $U$ is a representation of SO(3). This accords with the
heuristic observation that angular momentum is most important where there is
rotational symmetry.


9.6. The irreducible representations of the rotation group 


Given the connection between the rotation group and angular momentum described
in the last section, it is easy to find all the irreducible representations of
SO(3). We already found in Theorem 8.3.1 that the only ways to satisfy the
angular momentum commutation relations required a $(2j + 1)$-dimensional space
with a basis $\{\psi_m : -j \le m \le j\}$ of $J_3$-eigenvectors satisfying

$$J_3\psi_m = m\hbar\psi_m. \tag{9.42}$$

This relation exponentiates to tell us that for rotations $R_3(t)$ about the
third coordinate axis

$$U(R_3(t))\psi_m = \exp\left(-\frac{itJ_3}{\hbar}\right)\psi_m = e^{-imt}\psi_m. \tag{9.43}$$

If we require that $U(R_3(2\pi)) = 1$ then $m$ must be integral. This in turn
forces $j$ to be integral and recovers precisely those representations that
could be realized on wave functions. If one takes the explicit realization of
the angular momentum operators on spherical harmonics described in Section 8.4
then one can show that the rotation group acts on the wave functions as in
Example 9.1.2. We shall denote the $(2j + 1)$-dimensional representation of
the rotation group by $D^j$.


9.7. Characters 


When one recalls the amount of work that went into finding the solution of the
angular momentum commutation relations, one realizes how much more complicated
multi-dimensional representations are than the one-dimensional representations
of abelian groups. In this section we shall show that it is nonetheless
possible to find a single complex-valued function that encapsulates all the
information for any finite-dimensional representation.


Definition 9.7.1. Let $U$ be a finite-dimensional unitary representation of
$G$. The character $\chi_U$ of $U$ is the complex-valued function on $G$
defined by

$$\chi_U(x) = \mathrm{tr}(U(x)).$$


Remark 9.7.1. Clearly if $U$ is one dimensional then it is just multiplication
by $\chi_U$, so that one-dimensional representations are sometimes called
characters. In general the dimension of $U$ is given by $\chi_U(1)$ where 1
denotes the identity in $G$.


The main properties of characters are summarized below. 


Proposition 9.7.1. Let $TUT^{-1}$ denote the representation that sends $x$ in
$G$ to $TU(x)T^{-1}$. Then for any $x$ and $y$ in $G$ and any
finite-dimensional representations $U$ and $W$ we have

(i) $\chi_{TUT^{-1}}(x) = \chi_U(x)$;
(ii) $\chi_U(yxy^{-1}) = \chi_U(x)$;
(iii) $\chi_{U \oplus W}(x) = \chi_U(x) + \chi_W(x)$.


Proof. (i) We have by definition and the standard properties of traces

$$\chi_{TUT^{-1}}(x) = \mathrm{tr}(TU(x)T^{-1}) = \mathrm{tr}(T^{-1}TU(x)) = \mathrm{tr}(U(x)) = \chi_U(x). \tag{9.44}$$

(ii) By definition we have

$$\chi_U(yxy^{-1}) = \mathrm{tr}(U(y)U(x)U(y)^{-1}), \tag{9.45}$$

so setting $T = U(y)$ in the previous part we obtain
$\chi_U(yxy^{-1}) = \chi_U(x)$.

(iii) If $\{u_j\}$ and $\{w_\alpha\}$ are orthonormal bases for the
representation spaces of $U$ and $W$ respectively, then their union,
$\{u_j\} \cup \{w_\alpha\}$, is an orthonormal basis for their direct sum.
Thence we calculate that

$$\chi_{U \oplus W}(x) = \sum_j \langle u_j|U(x)u_j\rangle + \sum_\alpha \langle w_\alpha|W(x)w_\alpha\rangle = \chi_U(x) + \chi_W(x). \tag{9.46}$$


Remark 9.7.2. Part (i) shows that equivalent representations have the 
same character. The converse is also true and representations that have 
the same character are equivalent. We shall prove that for the rotation 
group in the next section. 


9.8. The characters of the rotation group 


According to Proposition 9.7.1 the value of a character depends only on the
conjugacy class of its argument. Now if $S$ is a rotation and $R_n(t)$ denotes
the rotation through an angle $t$ about the axis $n$ it is easy to check that

$$SR_n(t)S^{-1} = R_{Sn}(t), \tag{9.47}$$

which shows that the conjugacy class of a rotation depends only on the angle
through which it rotates and not on the axis. Let us write $A_j(\theta)$ for
the character of a rotation through $\theta$ in the $(2j + 1)$-dimensional
irreducible representation $D^j$.


Theorem 9.8.1. The characters of the rotation group are given by

$$A_j(\theta) = \frac{\sin[(j + \frac{1}{2})\theta]}{\sin(\frac{1}{2}\theta)}$$

for integral $j$.


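Proof. We already observed that $j$ must be integral. We may as well evaluate
the trace for a rotation about the third coordinate axis, and since this has
the basis elements $\{\psi_m\}$ as eigenvectors with eigenvalues
$\{\exp(-im\theta)\}$ we have

$$\begin{aligned}
A_j(\theta) &= \sum_{m=-j}^{j} e^{-im\theta} \\
&= e^{ij\theta}\,\frac{e^{-i(2j+1)\theta} - 1}{e^{-i\theta} - 1} \\
&= \frac{e^{i(j+\frac{1}{2})\theta} - e^{-i(j+\frac{1}{2})\theta}}{e^{\frac{1}{2}i\theta} - e^{-\frac{1}{2}i\theta}} \\
&= \frac{\sin[(j + \frac{1}{2})\theta]}{\sin(\frac{1}{2}\theta)}.
\end{aligned}$$
□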
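
The geometric-series step in this proof is easy to confirm numerically. The sketch below is only an illustration: it compares the direct sum of the eigenvalues $e^{-im\theta}$ with the closed form of Theorem 9.8.1 for a few integral $j$; the function names are hypothetical.

```python
import cmath
import math

def character_sum(j, theta):
    """Trace of a rotation through theta in D^j: sum of exp(-i m theta), m = -j..j."""
    return sum(cmath.exp(-1j*m*theta) for m in range(-j, j + 1))

def character_closed(j, theta):
    """Closed form sin((j + 1/2) theta) / sin(theta / 2) from Theorem 9.8.1."""
    return math.sin((j + 0.5)*theta)/math.sin(0.5*theta)

for j in range(4):
    theta = 1.1
    print(j, character_sum(j, theta).real, character_closed(j, theta))
```

At $\theta = 0$ the sum reduces to $2j + 1$, the dimension of the representation, in line with Remark 9.7.1.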


The key to many of the further properties of the characters is provided 
by the following orthogonality theorem. 


Theorem 9.8.2. For any non-negative $k$ and $j$ in $\frac{1}{2}\mathbb{Z}$

$$\int_0^{2\pi} \overline{A_k(\theta)}\,A_j(\theta)\,\frac{1 - \cos\theta}{2\pi}\,d\theta = \delta_{kj}.$$


Remark 9.8.1. The complex conjugation of $A_k$ might seem superfluous since
our explicit formula shows the characters of the rotation group to be real
valued; however, similar orthogonality relations hold for other groups whose
characters may be genuinely complex, so it is more convenient to include it.


Proof. This follows by a direct calculation since
$1 - \cos\theta = 2\sin^2\frac{1}{2}\theta$, so that

$$\begin{aligned}
\int_0^{2\pi} \overline{A_k(\theta)}\,A_j(\theta)\,\frac{1 - \cos\theta}{2\pi}\,d\theta
&= \int_0^{2\pi} \overline{A_k(\theta)\sin(\tfrac{1}{2}\theta)}\,A_j(\theta)\sin(\tfrac{1}{2}\theta)\,\frac{d\theta}{\pi} \\
&= \int_0^{2\pi} \sin[(k + \tfrac{1}{2})\theta]\,\sin[(j + \tfrac{1}{2})\theta]\,\frac{d\theta}{\pi}.
\end{aligned} \tag{9.48}$$

The usual Fourier orthogonality relations for sines now show that this is
$\delta_{kj}$. □
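
As a sanity check on the orthogonality relation, one can integrate numerically. The sketch below is a rough illustration only: it uses a midpoint rule on $[0, 2\pi]$ with the weight $(1 - \cos\theta)/2\pi$ and a few integral and half-integral labels; the step count and function names are arbitrary choices.

```python
import math

def character(j, theta):
    """sin((j + 1/2) theta) / sin(theta / 2), the closed form of the character."""
    return math.sin((j + 0.5)*theta)/math.sin(0.5*theta)

def inner_product(k, j, steps=4000):
    """Midpoint-rule estimate of the weighted integral of A_k * A_j over [0, 2 pi]."""
    h = 2*math.pi/steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5)*h          # midpoints avoid the endpoints where sin(theta/2) = 0
        weight = (1 - math.cos(theta))/(2*math.pi)
        total += character(k, theta)*character(j, theta)*weight
    return total*h

labels = [0, 0.5, 1, 1.5, 2]
for k in labels:
    print([round(inner_product(k, j), 6) for j in labels])
```

The printed table should be close to the identity matrix.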


Corollary 9.8.3. Every character $A$ of the rotation group can be expanded
uniquely in the form

$$A(\theta) = \sum_j n_j A_j(\theta), \qquad
n_j = \int_0^{2\pi} \overline{A_j(\theta)}\,A(\theta)\,\frac{1 - \cos\theta}{2\pi}\,d\theta.$$


Proof. The axis of a rotation is determined only up to a sign and we know that
$R_{-n}(\theta) = R_n(-\theta)$, so we must have $A(-\theta) = A(\theta)$. In
other words the characters are even functions. This means that
$A(\theta)\sin(\frac{1}{2}\theta)$ is an odd function that can be expanded
uniquely as a series of sines

$$A(\theta)\sin(\tfrac{1}{2}\theta) = \sum_j n_j \sin[(j + \tfrac{1}{2})\theta], \tag{9.49}$$

with $n_j$ defined by the stated formula. From this we obtain the series

$$A(\theta) = \sum_j n_j A_j(\theta),$$

as asserted. □


9.9. The spin representation of the rotation group


We only obtain representations of the rotation group when $j$ is integral, but
it is possible using the spin representation of angular momentum described in
Section 8.8 to obtain representations of a closely related group for
half-integral $j$ as well.


Definition 9.9.1. The group SU(2) is the group of $2 \times 2$ unitary
matrices (that is, $U^*U = 1$) with determinant 1.


Theorem 9.9.1. For each $U \in$ SU(2) there is a rotation $R(U)$ such that for
all $a \in \mathbb{R}^3$

$$U(\sigma\cdot a)U^* = \sigma\cdot R(U)a.$$


Proof. According to Theorem 8.8.1 the matrix $\sigma\cdot a$ is traceless and
self-adjoint. It is therefore immediate that $U(\sigma\cdot a)U^*$ is also
self-adjoint and that

$$\mathrm{tr}(U(\sigma\cdot a)U^*) = \mathrm{tr}(U^*U(\sigma\cdot a)) = \mathrm{tr}(\sigma\cdot a) = 0. \tag{9.50}$$

Theorem 8.8.1 now tells us that $U(\sigma\cdot a)U^* = \sigma\cdot a'$ for
some $a' \in \mathbb{R}^3$. Moreover, this relationship between $a$ and $a'$
is clearly linear. We also have the identity

$$\det(\sigma\cdot a) = \det\begin{pmatrix} a_3 & a_1 - ia_2 \\ a_1 + ia_2 & -a_3 \end{pmatrix} = -(a_1^2 + a_2^2 + a_3^2). \tag{9.51}$$

Since

$$\det(\sigma\cdot a') = \det(U(\sigma\cdot a)U^*) = \det(U)\det(\sigma\cdot a)\det(U^*) = \det(\sigma\cdot a), \tag{9.52}$$

we deduce that

$$|a'|^2 = |a|^2, \tag{9.53}$$

so that $a' = R(U)a$ for some orthogonal transformation $R(U)$. From the
identity

$$[U(\sigma\cdot a)U^*, U(\sigma\cdot b)U^*] = 2iU(\sigma\cdot(a \times b))U^*, \tag{9.54}$$

with the help of Corollary 8.8.2, we deduce that

$$R(U)a \times R(U)b = R(U)(a \times b), \tag{9.55}$$

so that $R(U)$ respects the orientation of the vector product and must be a
rotation. □


Theorem 9.9.2. The map $U \mapsto R(U)$ of the last theorem defines a
homomorphism from SU(2) onto the rotation group. Its kernel is $\{\pm 1\}$, so
that the rotation group is isomorphic to SU(2)$/\{\pm 1\}$.


Proof. Since

$$\sigma\cdot(R(UV)a) = UV(\sigma\cdot a)V^*U^* = U(\sigma\cdot(R(V)a))U^* = \sigma\cdot(R(U)R(V)a), \tag{9.56}$$

we see that $R(UV)a = R(U)R(V)a$ for all $a \in \mathbb{R}^3$ so that
$R(UV) = R(U)R(V)$ and $R$ is a homomorphism.

Using the unitarity of $U$, the kernel of $R$ consists of those $U \in$ SU(2)
for which $U(\sigma\cdot a)U^* = \sigma\cdot a$, or equivalently,

$$U(\sigma\cdot a) = (\sigma\cdot a)U, \tag{9.57}$$

for all $a \in \mathbb{R}^3$. Since, in particular, $U$ commutes with
$\sigma_3$ it must be a diagonal matrix, and since it commutes with $\sigma_1$
its diagonal entries must be the same. That means that $U$ is a multiple of
the identity, $\lambda 1$, say. Now $1 = \det(U) = \lambda^2$ forces
$\lambda = \pm 1$ and gives $\ker(R) = \{\pm 1\}$.

One can prove that every rotation $R_n(t)$ is in the image of $R$ by checking
that it is the image of
$U = \cos(\frac{1}{2}t) - i\sin(\frac{1}{2}t)\sigma\cdot n$. (See also
Exercise 8.11.) The first isomorphism theorem for groups now tells us that the
rotation group is isomorphic to SU(2)$/\ker(R)$ = SU(2)$/\{\pm 1\}$.
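
The map $U \mapsto R(U)$ can be made concrete in a few lines. The sketch below assumes the standard Pauli matrices and extracts the rotation matrix from the relation $U(\sigma\cdot a)U^* = \sigma\cdot R(U)a$ through $R(U)_{ij} = \frac{1}{2}\mathrm{tr}(\sigma_i U \sigma_j U^*)$, a formula that follows from $\mathrm{tr}(\sigma_i\sigma_j) = 2\delta_{ij}$ rather than one stated in the text; all function names are illustrative.

```python
import math

I2 = ((1, 0), (0, 1))
SIGMA = (
    ((0, 1), (1, 0)),        # sigma_1
    ((0, -1j), (1j, 0)),     # sigma_2
    ((1, 0), (0, -1)),       # sigma_3
)

def matmul(a, b):
    """Product of two matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k]*b[k][j] for k in range(len(b)))
                       for j in range(len(b[0]))) for i in range(len(a)))

def dagger(a):
    """Conjugate transpose of a 2 x 2 matrix."""
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2))

def su2_element(n, t):
    """U = cos(t/2) - i sin(t/2) sigma.n for a unit vector n."""
    c, s = math.cos(t/2), math.sin(t/2)
    return tuple(tuple(c*I2[i][j] - 1j*s*sum(n[k]*SIGMA[k][i][j] for k in range(3))
                       for j in range(2)) for i in range(2))

def rotation_of(u):
    """R(U) with entries (1/2) tr(sigma_i U sigma_j U*)."""
    ud = dagger(u)
    def entry(i, j):
        m = matmul(SIGMA[i], matmul(u, matmul(SIGMA[j], ud)))
        return complex(0.5*(m[0][0] + m[1][1])).real
    return tuple(tuple(entry(i, j) for j in range(3)) for i in range(3))

u = su2_element((0.0, 0.0, 1.0), 0.7)
for row in rotation_of(u):
    print([round(x, 6) for x in row])
```

For the axis $(0, 0, 1)$ the output is the familiar matrix of a rotation through $t$ about the third axis, and replacing $U$ by $-U$ leaves it unchanged, which illustrates the kernel $\{\pm 1\}$.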


The homomorphism $R$ gives an immediate connection between the representation
theory of SU(2) and the rotation group.


Corollary 9.9.3. Any representation $W$ of the rotation group lifts to a
representation $R^*W = W \circ R$ of SU(2).


In the proof of Theorem 9.9.2 we saw that the rotation through an angle $t$
about the third coordinate axis is the image of the diagonal matrix

$$U = \cos(\tfrac{1}{2}t) - i\sin(\tfrac{1}{2}t)\sigma_3 = \begin{pmatrix} e^{-\frac{1}{2}it} & 0 \\ 0 & e^{\frac{1}{2}it} \end{pmatrix}. \tag{9.58}$$

We write $\alpha_1 = \exp(\frac{1}{2}it)$ and $\alpha_2 = \exp(-\frac{1}{2}it)$
for the two diagonal entries. Using Theorem 9.8.1, we can obtain the following
expression for the character:

$$R^*A^l(\alpha) = A^l(t) = \frac{\sin[(l + \frac{1}{2})t]}{\sin(\frac{1}{2}t)} = \frac{\alpha_1^{2l+1} - \alpha_2^{2l+1}}{\alpha_1 - \alpha_2}. \tag{9.59}$$

This last expression shows that $R^*A^l$ makes perfectly good sense as a
character of SU(2) when $l$ is half-integral too. □


9.10. The hidden symmetries of hydrogen 


The components of the Runge-Lenz vector, $A$, in Section 8.7 provide
additional constants of the motion for the hydrogen atom, suggesting that
there might be additional symmetries. We shall now show how the spectrum of
hydrogen can be obtained by group theory.

We start by Fourier transforming Schrödinger’s equation. Writing
$\hat\psi = \mathcal{F}\psi$ and remembering that the Fourier transform of the
product $V\psi$ is a convolution, we obtain

$$\frac{p^2}{2m}\hat\psi(p) + \frac{1}{(2\pi\hbar)^{3/2}}\int \hat V(p - q)\hat\psi(q)\,d^3q = E\hat\psi(p). \tag{9.60}$$

It is straightforward to show that the Fourier transform of
$Ke^{-\kappa r}/r$ is $K\sqrt{2\hbar/\pi}/(p^2 + \kappa^2\hbar^2)$, so that,
letting $\kappa \to 0$, and substituting $E = -p_0^2/2m$, we have

$$\frac{p^2 + p_0^2}{2m}\hat\psi(p) - \frac{K}{2\pi^2\hbar}\int |p - q|^{-2}\hat\psi(q)\,d^3q = 0. \tag{9.61}$$

So far the calculation has been quite conventional, but we now introduce the
matrix variable,

$$Z = Z_0 + iZ\cdot\sigma = (p_0 + ip\cdot\sigma)^2/(p_0^2 + p^2), \tag{9.62}$$

which satisfies

$$Z^*Z = (p_0 - ip\cdot\sigma)^2(p_0 + ip\cdot\sigma)^2/(p_0^2 + p^2)^2 = 1, \tag{9.63}$$

so that $Z$ is unitary. Similarly $\det(Z) = 1$, so that $Z$ is actually in
the group SU(2). Letting $W = (p_0 + iq\cdot\sigma)^2/(p_0^2 + q^2)$, we may
calculate that

$$|p - q|^2 = 4p_0^2\det(Z - W)/\det(1 + Z)\det(1 + W), \tag{9.64}$$

so that writing $\Psi(Z) = \det(1 + Z)^{-2}\hat\psi(p)$ and
$dW = dW_1dW_2dW_3/\pi^2W_0$, we have

$$\Psi(Z) = \frac{mK}{\hbar p_0}\int \det(Z - W)^{-1}\Psi(W)\,dW. \tag{9.65}$$

This shows that Schrödinger’s equation for the hydrogen atom is equivalent to
an integral equation on the group SU(2), which we shall now solve using its
characters described in the previous section.
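
The identity (9.64) relating $|p - q|^2$ to determinants of SU(2) matrices can be spot-checked numerically. The following sketch assumes the standard Pauli matrices and builds $Z$ and $W$ from sample momenta; the function names and the sample values of $p_0$, $p$ and $q$ are illustrative only.

```python
I2 = ((1, 0), (0, 1))
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def matmul2(a, b):
    """Product of two 2 x 2 matrices."""
    return tuple(tuple(sum(a[i][k]*b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def combine(a, b, sign):
    """Entrywise a + sign * b."""
    return tuple(tuple(a[i][j] + sign*b[i][j] for j in range(2)) for i in range(2))

def det2(a):
    """Determinant of a 2 x 2 matrix."""
    return a[0][0]*a[1][1] - a[0][1]*a[1][0]

def momentum_to_su2(p0, p):
    """Z = (p0 + i p.sigma)^2 / (p0^2 + |p|^2)."""
    m = tuple(tuple(p0*I2[i][j] + 1j*sum(p[k]*SIGMA[k][i][j] for k in range(3))
                    for j in range(2)) for i in range(2))
    norm = p0*p0 + sum(x*x for x in p)
    mm = matmul2(m, m)
    return tuple(tuple(mm[i][j]/norm for j in range(2)) for i in range(2))

def check_identity(p0, p, q):
    """Return (|p - q|^2, 4 p0^2 det(Z - W) / (det(1 + Z) det(1 + W)))."""
    z, w = momentum_to_su2(p0, p), momentum_to_su2(p0, q)
    lhs = sum((a - b)**2 for a, b in zip(p, q))
    rhs = 4*p0**2*det2(combine(z, w, -1))/(det2(combine(I2, z, 1))*det2(combine(I2, w, 1)))
    return lhs, rhs

print(check_identity(1.3, (0.2, -0.7, 1.1), (-0.4, 0.9, 0.3)))
```

Both entries of the printed pair should agree, with a negligible imaginary part on the right-hand side.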


Lemma 9.10.1. Let $\chi_j = R^*A^{j/2}$ denote the character of SU(2) which
takes the value $(\alpha_1^{j+1} - \alpha_2^{j+1})/(\alpha_1 - \alpha_2)$ for a
group element whose eigenvalues are $\alpha_1$ and $\alpha_2$. Then

$$\det(Z - W)^{-1} = \sum_j \chi_j(WZ^{-1}).$$


Proof. We first observe that, if the eigenvalues of $W$ are $\alpha_1$ and
$\alpha_2$, then

$$\det(1 - W)^{-1} = (1 - \alpha_1)^{-1}(1 - \alpha_2)^{-1} = \sum_j \chi_j(W). \tag{9.66}$$

In general we have

$$\det(Z - W) = \det(Z)\det(1 - Z^{-1}W) = \det(1 - Z^{-1}W), \tag{9.67}$$

from which the result now follows. □
From this we deduce that $\Psi$ satisfies the equation

$$\Psi(Z) = \frac{mK}{\hbar p_0}\sum_j \int \chi_j(Z^{-1}W)\Psi(W)\,dW. \tag{9.68}$$


Lemma 9.10.2. For any operator $\rho$ on the $d_k$-dimensional space of the
irreducible representation $C^k = R^*D^{k/2}$ of SU(2) one has the identity

$$\int \frac{\mathrm{tr}(\rho C^k(W))}{\det(Z - W)}\,dW = \frac{1}{d_k}\mathrm{tr}(\rho C^k(Z)).$$


Proof. The representation $C^k$ has character $\chi_k$ and $d_k = k + 1$. We
start by considering

$$\int \frac{\mathrm{tr}(\rho C^k(UWU^{-1}))}{\det(1 - W)}\,dW, \tag{9.69}$$

and make the change of variable, $Y = UWU^{-1}$. According to Theorem 9.9.1
conjugation by $U$ just rotates $W$, so that $dY = dW$ and so the integral
becomes

$$\int \frac{\mathrm{tr}(\rho C^k(Y))}{\det(1 - U^{-1}YU)}\,dY. \tag{9.70}$$

Since $\det(1 - U^{-1}YU) = \det(U^{-1}(1 - Y)U) = \det(1 - Y)$, this gives the
identity

$$\int \frac{\mathrm{tr}(\rho C^k(UWU^{-1}))}{\det(1 - W)}\,dW = \int \frac{\mathrm{tr}(\rho C^k(W))}{\det(1 - W)}\,dW. \tag{9.71}$$

As this is true for all $\rho$ we deduce that

$$C^k(U)\left[\int \frac{C^k(W)}{\det(1 - W)}\,dW\right]C^k(U)^{-1} = \int \frac{C^k(W)}{\det(1 - W)}\,dW, \tag{9.72}$$

showing the integral on the right to be an intertwining operator for the
irreducible, $C^k$. By Schur’s lemma 9.4.1, this means that it is a multiple
of the identity:

$$\int \frac{C^k(W)}{\det(1 - W)}\,dW = \kappa 1. \tag{9.73}$$

Taking traces, we have

$$\int \frac{\chi_k(W)}{\det(1 - W)}\,dW = \kappa d_k. \tag{9.74}$$

On the other hand, using the expression (9.66) for $\det(1 - W)^{-1}$ as a sum
of characters, and the orthogonality relations analogous to Theorem 9.8.2, we
see that the left-hand side is 1, so that $\kappa = 1/d_k$. This gives us

$$\int \frac{C^k(W)}{\det(1 - W)}\,dW = \frac{1}{d_k}1, \tag{9.75}$$

and, multiplying by $\rho C^k(Z)$ and taking the trace, we obtain

$$\int \frac{\mathrm{tr}(\rho C^k(Z)C^k(W))}{\det(1 - W)}\,dW = \frac{1}{d_k}\mathrm{tr}(\rho C^k(Z)). \tag{9.76}$$

We finish the proof with another change of variables, replacing $W$ by
$Z^{-1}W$, and using the identity (9.67), to arrive at the result. □


This lemma provides a ready source of solutions to the Schrödinger equation,
since we may take $mK/\hbar p_0 = d_k$ and $\Psi(Z) = \mathrm{tr}(\rho C^k(Z))$
for any $\rho$. The corresponding energy is

$$E = -\frac{p_0^2}{2m} = -\frac{mK^2}{2\hbar^2d_k^2}, \tag{9.77}$$

and there are $d_k^2$ independent choices of $\rho$, in agreement with the
usual formulae for the energy and degeneracy.
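
To see that the condition $mK/\hbar p_0 = d_k$ reproduces the familiar hydrogen levels, one can put in numbers. The sketch below assumes that $K$ is the Coulomb constant $e^2/4\pi\varepsilon_0$ and that $m$ is the electron mass (both are assumptions about the notation of Section 8.7, which is not reproduced here), and uses SI values of the constants; the function names are illustrative.

```python
import math

HBAR = 1.054571817e-34        # reduced Planck constant, J s
M_E = 9.1093837015e-31        # electron mass, kg (assumed to be the m of the text)
E_CHARGE = 1.602176634e-19    # elementary charge, C
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
K = E_CHARGE**2/(4*math.pi*EPS0)   # assumption: K = e^2 / (4 pi eps0)

def p0_for(k):
    """p0 fixed by m K / (hbar p0) = d_k with d_k = k + 1."""
    return M_E*K/(HBAR*(k + 1))

def energy_joules(k):
    """E = -p0^2 / 2m for the representation C^k."""
    return -p0_for(k)**2/(2*M_E)

def degeneracy(k):
    """Number of independent choices of rho, d_k^2."""
    return (k + 1)**2

for k in range(3):
    print(k + 1, energy_joules(k)/E_CHARGE, degeneracy(k))
```

With these assumptions the lowest level comes out at about -13.6 eV, and the levels scale as $1/d_k^2$ with degeneracy $d_k^2$.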

To complete this approach we shall sketch why these are the only solutions. To
do this we multiply equation (9.68) by $\chi_k(ZX^{-1})$ and integrate, to
obtain

$$\int \chi_k(ZX^{-1})\Psi(Z)\,dZ = \frac{mK}{\hbar p_0}\iint \frac{\chi_k(X^{-1}Z)}{\det(Z - W)}\Psi(W)\,dW\,dZ. \tag{9.78}$$

Now, setting $\rho = C^k(X^{-1})$ and interchanging $W$ and $Z$ in the lemma we
have

$$d_k\int \frac{\chi_k(X^{-1}Z)}{\det(W - Z)}\,dZ = \chi_k(X^{-1}W). \tag{9.79}$$

Since $\det(W - Z) = \det(Z - W)$, this may be substituted into equation (9.78)
to give

$$d_k\int \chi_k(ZX^{-1})\Psi(Z)\,dZ = \frac{mK}{\hbar p_0}\int \chi_k(X^{-1}W)\Psi(W)\,dW. \tag{9.80}$$

According to equation (9.68) the integral cannot vanish for all $k$ unless
$\Psi = 0$. For any $k$ giving a non-zero integral we then have
$mK/\hbar p_0 = d_k$, confirming our earlier result.


9.11. Wigner’s theorem 


Symmetries of a quantum system are not always realized by unitary
representations. Physically all that matters is the action on states (which are
unchanged when the vector is multiplied by a scalar) and that the transition
probabilities, $|\langle\phi|\psi\rangle|^2/\|\phi\|^2\|\psi\|^2$, are
unchanged by the action of the symmetry group. This could also be ensured by
taking antiunitary operators $U(g)$, that is operators such that

$$\langle U(g)\phi|U(g)\psi\rangle = \langle\psi|\phi\rangle, \tag{9.81}$$

since this is just the conjugate of $\langle\phi|\psi\rangle$. In fact, we have
the following result:


Wigner’s theorem 9.11.1. Let $G$ be a group that acts on the states of a space
$\mathcal{H}$ in such a way as to preserve transition probabilities,

$$|\langle\phi|\psi\rangle|^2/\|\phi\|^2\|\psi\|^2.$$

Then up to scalar multiples the action of $g \in G$ can be written in the form
$\psi \mapsto U(g)\psi$, where each $U(g)$ is either unitary or antiunitary,
and

$$U(xy) = \sigma(x, y)U(x)U(y)$$

for some complex number $\sigma(x, y)$ of modulus 1.


There are physically important examples for which the operators are
antiunitary rather than unitary. Many physical systems are unchanged by time
reversal and this provides one of the simplest examples of the occurrence of
antiunitary operators. Let us define the action of the time reversal operator
$T$ on a wave function to be

$$(T\psi)(t, r) = \overline{\psi(-t, r)}. \tag{9.82}$$

If we just change $t$ to $-t$ in Schrödinger’s equation with a
time-independent potential then it becomes

$$-i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi, \tag{9.83}$$

so that complex conjugation turns it back to the usual Schrödinger equation.
The combination of reversing the sign of $t$ and conjugating, which occur in
$T$, therefore gives a symmetry of the system. Another common example of an
antiunitary operator is provided by the charge conjugation $C$ which reverses
the charge of each particle and conjugates its wave function. We shall discuss
this in more detail in Section 18.3.

Nonetheless, for many systems only unitary operators can occur.


Weyl’s lemma 9.11.2. Under the assumptions of the previous theorem, $U(g^2)$
is unitary for all $g \in G$.


Proof. We know that $U(g^2) = \sigma(g, g)U(g)^2$. If $U(g)$ is unitary then so
also is its square, but even if $U(g)$ is antiunitary its square satisfies

$$\langle U(g)^2\phi|U(g)^2\psi\rangle = \langle U(g)\psi|U(g)\phi\rangle = \langle\phi|\psi\rangle, \tag{9.84}$$

and so is unitary. Since $\sigma(g, g)$ just multiplies by a scalar, $U(g^2)$
is unitary in either case. □
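
The key step (9.84) can be illustrated on a two-dimensional space. In the sketch below the antiunitary operator is a hypothetical example, complex conjugation followed by a fixed real unitary matrix; it is chosen only for illustration and is not an operator defined in the text. The inner product is taken to be conjugate-linear in its first argument.

```python
def inner(phi, psi):
    """<phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate()*b for a, b in zip(phi, psi))

def antiunitary(psi):
    """Hypothetical antiunitary map: conjugate, then apply [[0, -1], [1, 0]]."""
    c = [z.conjugate() for z in psi]
    return [-c[1], c[0]]

phi = [1 + 2j, -0.5j]
psi = [0.3 - 1j, 2 + 0.5j]

once = inner(antiunitary(phi), antiunitary(psi))
twice = inner(antiunitary(antiunitary(phi)), antiunitary(antiunitary(psi)))
print(once, inner(psi, phi))    # one application conjugates the inner product
print(twice, inner(phi, psi))   # two applications restore it
```

A single application exchanges the two arguments of the inner product, while the square preserves it, as the lemma requires of a unitary operator.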


In the rotation group every rotation through $\theta$ is the square of a
rotation through $\frac{1}{2}\theta$ about the same axis, and so must be
represented by a unitary operator. The same applies to most of the other groups
that we have considered. It is not so easy to get rid of the scalar multiplier
$\sigma(x, y)$. In fact, we have already encountered one example where it is
needed, in the case of the representations of the rotation group when $j$ is a
half-integer but not integral. Then, because of the sign ambiguities, the most
that can be said is that $U(xy) = \pm U(x)U(y)$, so that we are dealing with a
multiplier that only takes the values $\pm 1$. In general, when $\sigma$ is
needed, one says that $U$ is a projective representation with multiplier
$\sigma$. Fortunately, although some important physical examples are described
by projective representations, in many cases it is possible to get rid of
$\sigma$ and take $U$ to be a unitary representation.


Exercises 


9.1 Show that the cyclic group $\mathbb{Z}_N$ of order $N$, with generator $z$,
has representations given by $U(z^k) = \exp(2\pi kri/N)$, for
$r = 0, \ldots, N - 1$, and that every irreducible representation is of this
form.


9.2 Let $a$ be the cyclic permutation (123) in the symmetric group $S_3$, and
let $b$ be the transposition (12). Show that $b$ and $a$ are of order 2 and 3,
respectively, and that $ba = a^2b$. Show that there is a representation, $U$,
of $S_3$ with

$$U(a) = \begin{pmatrix} \omega & 0 \\ 0 & \omega^2 \end{pmatrix}, \qquad U(b) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},$$

provided that $\omega^3 = 1$. Show that if $\omega \ne 1$ then it is
irreducible.


9.3 Let $\mathcal{H}$ be the space of complex-valued functions on a finite
group, $G$, equipped with the inner product

$$\langle\phi|\psi\rangle = \sum_{g \in G} \overline{\phi(g)}\,\psi(g),$$

for any $\phi$ and $\psi$ in $\mathcal{H}$. Show that

$$(U(x)\psi)(g) = \psi(x^{-1}g)$$

defines a unitary representation of $G$.


9.4° The vectors $e_1$, $e_2$, and $e_3$ form a right-handed orthonormal triad
in $\mathbb{R}^3$. A group multiplication is put on $\mathbb{R}^3$ by defining
the product of two vectors to be

$$x \circ y = x + y + \tfrac{1}{2}[e_3, x, y]e_3,$$

where the square brackets denote the triple scalar product. If $U$ is a
representation of this group show that

$$U(x)^{-1}U(y)U(x) = U(y - [e_3, x, y]e_3).$$

Show also that, for $j = 1$, 2, or 3, the map taking the real number $t$ to
$U(te_j)$ defines a representation of $\mathbb{R}$. If $A_j$ is the
infinitesimal generator corresponding to $U(te_j)$ show that

$$U(x)^{-1}A_2U(x) = A_2 - (x\cdot e_1)A_3$$

and

$$U(x)^{-1}A_3U(x) = A_3.$$

Deduce that

$$[A_1, A_2] = i\hbar A_3$$

and thence that, if $U$ is irreducible, then either $A_3 = 0$ or
$X = A_3^{-1}A_1$ and $P = A_2$ satisfy the canonical commutation relations.


9.5° Let $U$ be an irreducible unitary representation of the rotation group
SO(3) on a finite-dimensional inner product space $V$. Suppose that there is a
non-zero vector $\Omega$ that is fixed by $U(h)$ whenever $h$ is a rotation
about the axis $k$ in $\mathbb{R}^3$. Show that it is possible to define a
one-one linear map $T$ from $V$ to the functions on the sphere
$S^2 = \{u \in \mathbb{R}^3 : |u| = 1\}$ by

$$(T\psi)(gk) = \langle U(g)\Omega|\psi\rangle,$$

for $g \in$ SO(3) and $\psi \in V$. Show that

$$(TU(x)\psi)(u) = (T\psi)(x^{-1}u)$$

for all $x \in$ SO(3) and $u \in S^2$. Show that there is a unique
representation, $U$, on $V = \mathbb{C}^3$ whose action on real vectors is the
usual rotational action on $\mathbb{R}^3$. By taking $\Omega = k$, show that
the image of $T$ then consists of the restriction to $S^2$ of functions linear
in the components of $u$.


10 Measurements and paradoxes


Quantum mechanics is very impressive, but hardly brings us any closer to
the secrets of the Old One. I am at any rate convinced that He does not
play dice.


ALBERT EINSTEIN, letter to Max Born, 4 December 1926


10.1. The quantum Zeno paradox 


We noted in Section 6.5 that quantum measurements change the system 
which is measured. The abrupt change of state caused by a measurement is 
quite different from the steady evolution described by Schrédinger’s equa- 
tion, and this shows up particularly clearly if we imagine carrying out 
frequent repeated measurements on an evolving system. 

Let $U$ be any operator on $\mathcal{H}$ and $P_\phi$ the projection onto a
normalized vector $\phi$, that is $P_\phi\psi = \langle\phi|\psi\rangle\phi$.
Then $P_\phi UP_\phi = \langle\phi|U\phi\rangle P_\phi$, and by induction we
see that

$$P_\phi UP_\phi UP_\phi \cdots UP_\phi = \langle\phi|U\phi\rangle^n P_\phi, \tag{10.1}$$

where $n$ is the number of $U$s appearing on the left. Now taking for $U$ the
unitary time evolution operator defined by the Hamiltonian $H$, and using
Lemma 7.3.1, we have

$$\langle\phi|U_{t/n}\phi\rangle^n = \left[1 - \frac{it}{n\hbar}E_\phi(H) + O\left(\frac{1}{n^2}\right)\right]^n
= \exp\left(-\frac{it}{\hbar}E_\phi(H)\right)\left[1 + O\left(\frac{1}{n}\right)\right]. \tag{10.2}$$

As $n \to \infty$ for fixed $t$ this tends to $\exp(-itE_\phi(H)/\hbar)$ so
that

$$P_\phi U_{t/n}P_\phi U_{t/n}P_\phi \cdots U_{t/n}P_\phi \to \exp\left(-\frac{it}{\hbar}E_\phi(H)\right)P_\phi. \tag{10.3}$$

Thus a very large number of measurements regularly repeated at intervals $t/n$
causes the system to evolve almost as though it were in an eigenstate with
energy $E_\phi(H)$. Since scalar multiples do not affect the physical state the
system never evolves away from $\phi$. When the expectation value vanishes
then we even have

$$P_\phi U_{t/n}P_\phi U_{t/n} \cdots P_\phi U_{t/n}P_\phi \to P_\phi. \tag{10.4}$$

This effect is known as the quantum watched pot effect or the quantum Zeno
paradox (by comparison with the classical Zeno paradoxes which seemed to show
that motion is impossible).
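
A two-level system makes the effect easy to see. The sketch below assumes a hypothetical Hamiltonian $H = g\sigma_1$ with $\hbar = 1$ and the initial state $\phi = (1, 0)$, for which $\langle\phi|U_s\phi\rangle = \cos(gs)$ and the expectation value of $H$ in $\phi$ vanishes; none of these choices comes from the text.

```python
import math

def survival_probability(g_t, n):
    """|<phi|U_{t/n} phi>|^(2n) for H = g sigma_1, phi = (1, 0), hbar = 1.

    g_t is the product g * t; n is the number of equally spaced measurements.
    """
    amplitude = math.cos(g_t/n)   # <phi|U_{t/n} phi> for this model
    return amplitude**(2*n)

g_t = math.pi/2   # without intermediate measurements the state would be left entirely
for n in (1, 2, 10, 100, 1000):
    print(n, survival_probability(g_t, n))
```

With a single measurement at the end the probability of finding the system still in $\phi$ is essentially zero, while for large $n$ it approaches 1, which is the behaviour described by (10.4).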
10.2. Bell’s inequality


Even more alarming than the abruptness of the projection which accom- 
panies a measurement is the fact that the change in the wave function is 
not just local but instantaneously pervades the whole of space, in a way 
which seems to defy our classical notions of space and time. The moment 
that the energy of the harmonic oscillator is measured to be
$\frac{1}{2}\hbar\omega$ the wave function is transmuted to a multiple of
$\psi_0$ throughout the entire universe.
This offends common sense and seems to contradict the assertion of rela- 
tivity theory that information cannot be transmitted faster than the speed 
of light. This is at the nub of most of the famous paradoxes of quantum 
mechanics, and it led Einstein and Schrédinger, amongst others, to reject 
quantum theory as a complete theory of physics. 

To appreciate the differences between classical and quantum measure- 
ments it is useful to examine some simple features of classical probability 
theories first. As long as one is dealing with only one observable at a time 
there is essentially no difference between classical and quantum probability. 
However, this changes as soon as one starts to consider questions involv- 
ing two or more observables at the same time, such as joint distributions 
or conditional probabilities: interference effects can arise which make the 
quantum probabilities quite different from their classical analogues. 

Let us first consider the classical probability, $P(R \setminus S)$, of the
event that $R$ occurs but $S$ does not.


Lemma 10.2.1. For any events $Q$, $R$, and $S$, we have

$$P(Q \setminus R) + P(R \setminus S) \ge P(Q \setminus S).$$


Proof. This can easily be seen from the Venn diagram in Figure 10.1, or by the
following argument. Any point $q \in Q \setminus S$ is either in $R$ or not in
$R$. In the first case it is in $R \setminus S$ and otherwise it is in
$Q \setminus R$. This shows that
$(Q \setminus S) \subseteq (Q \setminus R) \cup (R \setminus S)$, from which we
deduce the inequality for the probabilities. □
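
The set inclusion behind the lemma can be tested on a small finite sample space. The sketch below assumes the uniform probability on a set of 12 points and random events; the helper names, the seed and the sample size are arbitrary choices made for illustration.

```python
import random
from fractions import Fraction

SPACE_SIZE = 12   # hypothetical finite sample space {0, ..., 11}

def prob(event):
    """Uniform probability of an event given as a set of sample points."""
    return Fraction(len(event), SPACE_SIZE)

def random_event(rng):
    """Each sample point is included independently with probability 1/2."""
    return {x for x in range(SPACE_SIZE) if rng.random() < 0.5}

def lemma_holds(q, r, s):
    """Check the inclusion and the inequality P(Q\\R) + P(R\\S) >= P(Q\\S)."""
    included = (q - s) <= ((q - r) | (r - s))
    return included and prob(q - r) + prob(r - s) >= prob(q - s)

def equality_holds(q, r, s):
    """True when the two sides of the inequality coincide."""
    return prob(q - r) + prob(r - s) == prob(q - s)

rng = random.Random(0)
trials = [(random_event(rng), random_event(rng), random_event(rng)) for _ in range(200)]
print(all(lemma_holds(q, r, s) for q, r, s in trials))
```

On a space where every point has positive probability, equality occurs exactly when $Q \cap S \subseteq R \subseteq Q \cup S$, which can be checked with `equality_holds`.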


Remark 10.2.1. It can actually be shown (see Exercise 10.3) that

$$(Q \setminus R) \cup (R \setminus S) = (Q \setminus S) \cup ((Q \cap S) \setminus R) \cup (R \setminus (Q \cup S)), \tag{10.5}$$

so that there is equality if and only if

$$Q \cap S \subseteq R \subseteq Q \cup S. \tag{10.6}$$


FIGURE 10.1. From the Venn diagram one may see that
$P(Q \setminus R) + P(R \setminus S) \ge P(Q \setminus S)$.


This readily extends to larger numbers of events.


Corollary 10.2.2. For $n \ge 2$ events $Q_1, Q_2, \ldots, Q_n$, we have the
inequality

$$\sum_{j=1}^{n-1} P(Q_j \setminus Q_{j+1}) \ge P(Q_1 \setminus Q_n).$$


Proof. This can be proved by induction on $n$, starting with the case of
$n = 2$, when the two sides are identical and there is equality. For $n + 1$
events the inductive hypothesis gives

$$\sum_{j=1}^{n} P(Q_j \setminus Q_{j+1}) \ge P(Q_1 \setminus Q_n) + P(Q_n \setminus Q_{n+1}), \tag{10.7}$$

and the result follows on applying the lemma with $Q = Q_1$, $R = Q_n$, and
$S = Q_{n+1}$. □


FIGURE 10.2. The distance between the two events $R$ and $S$ is the probability
of being in one but not both, that is in the shaded region.


We can also deduce from the lemma a useful result about the probabil- 
ity that just one of two events R and S occurs, which is the sum of the 
probabilities P(R \ S) and P(S \ R) (see Figure 10.2). 


Proposition 10.2.3. For any events $Q$, $R$, and $S$,
$D(R, S) = P(R \cup S) - P(R \cap S)$ satisfies:

(i) $D(R, R) = 0$;

(ii) symmetry, $D(R, S) = D(S, R)$;

(iii) the triangle inequality, $D(Q, S) \le D(Q, R) + D(R, S)$.


Proof. The first property is obvious from the fact that $R \setminus R$ is
empty, and the second from the fact that $D(R, S)$ is defined symmetrically in
$R$ and $S$. The third follows by adding the result of the lemma for the sets
$Q$, $R$, and $S$ to that when their order is reversed. □


Remark 10.2.2. In topology a function $D$ satisfying these three conditions
and which only vanishes when $R = S$, is called a distance function or metric.
In fact our $D(R, S)$ vanishes if and only if $R$ and $S$ always occur together
(never one without the other), so if we identify two events which can only
occur together then $D$ makes the set of events into a metric topological
space. This result has been known to mathematicians since the 1920s, but its
relevance to quantum theory was first appreciated by John Bell, who
independently rediscovered an equivalent form of the triangle inequality for
expectation values in 1964 (see Exercise 10.2). In 1969 Clauser, Horne,
Shimony, and Holt extended the idea to four events (the $n = 4$ case of
Corollary 10.2.2).


FIGURE 10.3. Unpolarized light incident on a vertical polarizing filter from
the left emerges vertically polarized.


10.3. Polarization 


Although at first sight it is rather disconcerting that a measurement can 
change the state of a quantum system, there is an everyday example of the 
same sort of phenomenon. Light can be polarized horizontally, vertically, or 
at any intermediate angle by passing it through an appropriate polarizing 
filter. When confused by the paradoxes of quantum measurements it is 
often helpful to take some sheets of polaroid and see how the light behaves 
in the corresponding situation. Polarization is also important because it 
has provided some of the most sensitive checks of quantum theory so far. 
The effect of a vertically polarizing filter on a light wave can be visualized 
as in Figure 10.3 above. One can also measure the polarization of a light 
beam by passing it through a filter, and comparing the intensity of the 
light which emerges with that incident on the filter. Since the filter only 
transmits light polarized at the appropriate angle, the measurement has, 
just as in quantum theory, affected what is measured. 
Mathematically the polarization states of light can be described by vectors in
a two-dimensional inner product space $\mathcal{H}$ spanned by orthogonal
vectors $\xi$ and $\eta$, which represent vertical and horizontal polarizations
respectively. If the polarization of the incoming beam was described by $\psi$
then after passing through a vertically polarized filter it becomes
$\langle\xi|\psi\rangle\xi$. In other words the filter simply projects out the
component of $\psi$ in the direction of $\xi$. This is exactly the same as the
effect of quantum measurement. Light polarized at an angle $\theta$ to the
vertical is described by the vector

$$\psi_\theta = \xi\cos\theta + \eta\sin\theta. \tag{10.8}$$

When the initial beam is polarized at an angle $\theta$, so that
$\psi = \psi_\theta$, this gives
$\langle\xi|\psi_\theta\rangle\xi = \xi\cos\theta$. This means that when we
think of light as consisting of photons rather than waves, the probability of a
photon passing through the filter is $\cos^2\theta$, and the intensity of light
emerging from the filter has the same angular dependence. This result, which
can easily be verified, was discovered by Etienne Malus in 1809 and is known as
Malus’ law. (See Figure 10.4.) When $\theta = \pi/2$ no light can pass through
both filters.


FIGURE 10.4. Malus’ law tells us that a proportion $\cos^2\theta$ of the light
transmitted by the first filter will also pass through a second filter
polarized at an angle $\theta$ to the first.

When thinking of light as a stream of photons one must guard against the
temptation to regard these as classical particles. The dangers become apparent
when one considers the effect of inserting a third filter between the first
two. A naive view of a filter is that it blocks photons which are not polarized
in the appropriate direction, which suggests that the insertion of a third
filter must result in even more photons being stopped, and so in a diminution
of the light emerging from the system.

On the other hand we can easily calculate the intensity of the transmitted
light directly. Let us suppose that the middle filter is polarized at an angle
$\phi$ to the vertical (Figure 10.5). Since the middle filter makes an angle
$\theta - \phi$ with the first, a proportion $\cos^2(\theta - \phi)$ of the
light passes through. At the third filter the probability of getting through is
$\cos^2\phi$, giving an overall probability of transmission
$\cos^2(\theta - \phi)\cos^2\phi$. This must be compared with the proportion
$\cos^2\theta$ before the middle filter was inserted. One need only consider
the case when the outer filters are at right angles ($\theta = \pi/2$) to see
that the naive picture is quite wrong. Without the middle filter the
probability of transmission is $\cos^2\pi/2 = 0$, whilst with the middle filter
it is

$$\cos^2\left(\frac{\pi}{2} - \phi\right)\cos^2\phi = \sin^2\phi\cos^2\phi = \tfrac{1}{4}\sin^2 2\phi, \tag{10.9}$$

which is positive for most angles. In other words, the middle filter, far from
impeding itinerant photons, actually enhances their chances of passing through.


FIGURE 10.5. A third filter polarized at an angle $\phi$ to the vertical
between the two allows a proportion $\cos^2\phi$ to pass and then a proportion
$\cos^2(\theta - \phi)$ can pass the second filter as well. Overall a
proportion $\cos^2\phi\cos^2(\theta - \phi)$ passes all three filters.
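
The comparison between two and three filters is a short calculation. The following sketch simply evaluates the transmission proportions quoted in the text; the function names are illustrative.

```python
import math

def two_filter(theta):
    """Proportion passing a second filter at angle theta to the first (Malus' law)."""
    return math.cos(theta)**2

def three_filter(theta, phi):
    """Proportion passing when a middle filter at angle phi to the vertical is inserted."""
    return math.cos(theta - phi)**2*math.cos(phi)**2

theta = math.pi/2                 # outer filters at right angles
print(two_filter(theta))          # essentially zero
for phi in (math.pi/8, math.pi/4, 3*math.pi/8):
    print(phi, three_filter(theta, phi), 0.25*math.sin(2*phi)**2)
```

For crossed outer filters the transmission with the middle filter equals $\frac{1}{4}\sin^2 2\phi$, with a maximum of one quarter at $\phi = \pi/4$.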
FIGURE 10.6. By the sine rule the sides of the triangle are proportional to
$\sin\phi$, $\sin(\theta - \phi)$, and $\sin\theta$. Bell’s inequality amounts
to the condition that the side $\sin\theta$ should be less than the hypotenuse
of a right-angled triangle with sides $\sin\phi$ and $\sin(\theta - \phi)$.
This is clearly false if $\pi - \theta > \pi/2$, that is if $\pi/2 > \theta$.


r, and letting $Q$ be the event that a photon passes $q$, we similarly
calculate that $P(Q\setminus R) = P(Q)\sin^2(\theta-\phi)$ and
$P(Q\setminus S) = P(Q)\sin^2\theta$.

The problem is that these need not satisfy the triangle-type inequality
of Lemma 10.2.1. We shall see later that it is easy to create a beam for
which $P(Q) = P(R)$. Then the triangle inequality is violated whenever
$0 < \phi < \theta < \pi/2$. To see this, consider a triangle whose edges are
parallel to the polarization directions of the three filters. (See Figure
10.6.) By the sine rule the sides have lengths proportional to $\sin\theta$,
$\sin(\theta-\phi)$, and $\sin\phi$. By the cosine rule we have

\[ \sin^2\theta = \sin^2\phi + \sin^2(\theta-\phi)
   - 2\sin\phi\,\sin(\theta-\phi)\cos(\pi-\theta), \tag{10.10} \]

which can be rewritten as

\[ \sin^2\phi + \sin^2(\theta-\phi) - \sin^2\theta
   = -2\sin\phi\,\sin(\theta-\phi)\cos\theta. \tag{10.11} \]

Multiplying by $P(Q) = P(R)$ and taking $0 < \phi < \theta < \pi/2$, we have

\[ P(Q\setminus R) + P(R\setminus S) - P(Q\setminus S) < 0. \tag{10.12} \]

It is obvious that this apparent violation of our earlier result stems from
a careless use of probability. It is clear, for example, that the event $S$
that a photon passes filter $s$ does not have the same meaning in
$P(Q\setminus S)$ as it has in $P(R\setminus S)$, because the photon is
changed during transmission by $q$ or $r$.
When $\theta = 2\phi$, so that the angles between successive filters are
$\phi$, the violation of the triangle inequality is proportional to

\[ 2\sin^2\phi - \sin^2 2\phi = -2\sin^2\phi\,\cos 2\phi, \tag{10.13} \]

which is negative for all small angles $\phi$, rather than positive as the
triangle inequality would demand. One could do even better by adding another
filter $t$ at an angle $\phi$ to $s$. For light which is equally likely to be
transmitted by any one of the filters, one obtains

\[ P(Q\setminus R) + P(R\setminus S) + P(S\setminus T) - P(Q\setminus T)
   = P(Q)\left(3\sin^2\phi - \sin^2 3\phi\right), \tag{10.14} \]

giving an even larger violation of Corollary 10.2.2.
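
The size of these violations can be checked numerically. The sketch below assumes only the expressions in (10.13) and (10.14), divided by $P(Q)$; the function name `chain_violation` and the choice of sample angles are illustrative.

```python
import math

def chain_violation(n: int, phi: float) -> float:
    """n sin^2(phi) - sin^2(n phi): the combination of probabilities in
    (10.13) (n = 2) and (10.14) (n = 3), divided by P(Q)."""
    return n * math.sin(phi) ** 2 - math.sin(n * phi) ** 2

# Check the identity (10.13) at a few angles.
for phi in (0.1, 0.4, 0.7):
    lhs = chain_violation(2, phi)
    rhs = -2 * math.sin(phi) ** 2 * math.cos(2 * phi)
    assert abs(lhs - rhs) < 1e-12

# Classical probability would require these values to be non-negative.
for n in (2, 3):
    phi = math.pi / (2 * (n + 1))  # angle singled out in the exercises
    print(f"n = {n}: phi = pi/{2 * (n + 1)}, value = {chain_violation(n, phi):.4f}")
```

With three filters the value is $-1/4$ at $\phi = \pi/6$, and with four it is $1-\sqrt2$ at $\phi = \pi/8$, so the extra filter does enlarge the violation.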


10.4. The Einstein—Podolsky—Rosen paradox 


Useful though the example of the polarization filters is, it does not disturb 
us unduly because we can easily comprehend how successive filtering of light 
is different from the conjunction of events in classical probability theory. 
However, there is a simple modification of the idea of filtering which does 
seem truly paradoxical. The idea goes back to Einstein who was never
happy with the primacy of statistical laws in quantum theory. In 1935, in 
collaboration with Boris Podolsky and Nathan Rosen, he discovered one of 
the most puzzling paradoxes of the theory, which exhibits very clearly the 
grounds for his unease. 

The paradox relies on the fact that conservation laws often provide in- 
formation about one part of a system in terms of another. For example, if a 
stationary atom spontaneously decays into two fragments, their momenta 
must be equal and opposite. Measuring the momentum of fragment A tells 
us the momentum of the fragment B as well. This suggests that we might 
be able to beat the uncertainty principle by measuring the position of B 
and the momentum of A. Combining the information would give both the 
position and momentum of B. 

Similar considerations apply in the case of an atom which emits two 
identically polarized photons. By conservation of momentum these travel 
in opposite directions, and we could arrange for each photon to encounter 
a polarizing filter at some distance from the atom. Since the photons 
have identical polarization their behaviour at the filters must be correlated. 
Thus a measurement of the polarization of either photon can be interpreted 
as a measurement of the polarization of the other (Figure 10.7). If the 
two measurements are made at sites so widely separated that even a light 
signal communicating the result of one measurement would arrive only 
FIGURE 10.7. When an atom emits two identically polarized photons, A and B,
the effect of passing A through $r_A$ is the same as if B passed through $r_B$.


after the other measurement had occurred, it seems implausible that either 
measurement could be influenced by the other, for no known forces travel 
faster than light. More precisely, in these circumstances we expect that
the transmission of photon B by a filter $r_B$ really does depend only on
the photon and the filter and not on anything which has happened to
A, so that it constitutes a well-defined event $R_B$. We shall call this the
locality assumption. It means that probabilities such as
$P(Q_A\setminus R_B)$, the probability that photon A is transmitted by
filter $q_A$ but photon B is not transmitted by $r_B$, may be unambiguously
defined and should satisfy the normal probabilistic rules.

Now, in quantum theory there is a single state vector for the two photons

\[ \psi(r_A, r_B) = \frac{1}{\sqrt2}
   \left[\xi(r_A)\xi(r_B) + \eta(r_A)\eta(r_B)\right], \tag{10.15} \]

which must immediately change whenever one of them passes through a
filter. An elementary calculation shows that for any angle $\theta$ one can
also write

\[ \psi(r_A, r_B) = \frac{1}{\sqrt2}
   \left[\psi_\theta(r_A)\psi_\theta(r_B)
   + \psi_{\pi/2-\theta}(r_A)\psi_{\pi/2-\theta}(r_B)\right], \tag{10.16} \]

where $\psi_\theta = \xi\cos\theta + \eta\sin\theta$ and, similarly,
$\psi_{\pi/2-\theta} = \eta\cos\theta - \xi\sin\theta$. This means that,
whatever its polarization angle, $\theta$, there is a probability of a half
that a filter will transmit photon A. Then the wave function is projected
onto $\psi_\theta(r_A)\psi_\theta(r_B)/\sqrt2$, and the second photon is also
polarized at the same angle. The net effect would be the same had the second
photon passed through a filter at that angle instead. Every change in the
polarization state of one photon must therefore be matched by its distant
companion no matter how far dispersed they may be.
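
The elementary calculation behind (10.16) can be confirmed numerically. The following is a minimal sketch, assuming the two polarization states $\xi$ and $\eta$ are represented by the standard basis of a real two-dimensional space and the two-photon space by a Kronecker product; `rotated_pair` and `transmit_A` are illustrative names.

```python
import numpy as np

xi = np.array([1.0, 0.0])    # one polarization basis vector (assumed)
eta = np.array([0.0, 1.0])   # the orthogonal polarization

# Two-photon state (10.15): (xi xi + eta eta) / sqrt(2)
psi = (np.kron(xi, xi) + np.kron(eta, eta)) / np.sqrt(2)

def rotated_pair(theta):
    """Return xi cos(theta) + eta sin(theta) and eta cos(theta) - xi sin(theta)."""
    return (xi * np.cos(theta) + eta * np.sin(theta),
            eta * np.cos(theta) - xi * np.sin(theta))

def transmit_A(theta):
    """Probability that a filter at angle theta transmits photon A, together
    with the normalized two-photon state after that transmission."""
    p_theta, _ = rotated_pair(theta)
    projector = np.kron(np.outer(p_theta, p_theta), np.eye(2))
    projected = projector @ psi
    prob = float(projected @ projected)
    return prob, projected / np.sqrt(prob)

for theta in np.linspace(0.0, np.pi, 7):
    p_theta, p_perp = rotated_pair(theta)
    rebuilt = (np.kron(p_theta, p_theta) + np.kron(p_perp, p_perp)) / np.sqrt(2)
    assert np.allclose(rebuilt, psi)                       # the rewriting (10.16)
    prob, after = transmit_A(theta)
    assert np.isclose(prob, 0.5)                           # probability one half
    assert np.allclose(after, np.kron(p_theta, p_theta))   # B polarized at theta too
```

Only photon A is projected in `transmit_A`, yet the resulting state has both photons polarized at the angle $\theta$, which is the point of the argument.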
This is the nub of the paradox: whilst it is easy to see that a photon
might itself be changed by passing through a filter, it offends our
understanding of local forces that another should immediately suffer the
same fate, and we rather expect that distant photon to remain unchanged, at
least for a time, so that measurements on it obey the classical
probabilistic rules given earlier. Quantum mechanics, however, allows no
respite: its effects are immediate and all-pervading. The effect on the wave
function when photon A passes through a filter $r_A$ is indistinguishable
from that when photon B passes through a filter $r_B$ at the same angle. If A
encounters $r_A$ and B encounters a filter $s_B$, the effect should therefore
be the same as if B had encountered both $r_B$ and $s_B$. This means that
$P(R_A\setminus S_B) = P(R_B\setminus S_B)$. But this presents us with a
problem (see Figure


Proposition 10.4.1. Let $Q_A$ and $S_A$ be the events that a photon
A is transmitted by the polaroid filters $q_A$ or $s_A$, respectively, and
similarly $R_B$ and $T_B$ the events that photon B is transmitted by
filters $r_B$ or $t_B$. Then for some filter angles the quantum mechanical
probabilities satisfy

\[ P(Q_A\setminus R_B) + P(R_B\setminus S_A) + P(S_A\setminus T_B)
   < P(Q_A\setminus T_B). \]


Proof. We know that the probability of transmission by a filter at any
angle is $\tfrac12$. We have also seen that

\[ P(Q_A\setminus R_B) + P(R_B\setminus S_A) + P(S_A\setminus T_B)
   - P(Q_A\setminus T_B) \]
\[ = P(Q_B\setminus R_B) + P(R_B\setminus S_B) + P(S_B\setminus T_B)
   - P(Q_B\setminus T_B), \tag{10.17} \]

and we saw at the end of the last section that this can be negative. In
particular, if the angles between the successive filters are all $\phi$, one
expects

\[ P(Q_A\setminus R_B) + P(R_B\setminus S_A) + P(S_A\setminus T_B)
   - P(Q_A\setminus T_B)
   = P(Q)\left(3\sin^2\phi - \sin^2 3\phi\right), \tag{10.18} \]

as in equation (10.14). □
This directly contradicts the result in Corollary 10.2.2, and this time
we cannot take refuge in the argument that the events are not well defined,
because the locality assumption says that they should be. Bell was
FIGURE 10.8. By carrying out measurements on both photons it is possible to
determine quantities such as $P(Q_A\setminus S_B)$, and so to check Bell's
inequalities experimentally.


the first to realize that quantum mechanics gives predictions which are in- 
consistent with the inequalities for classical probability derived in Lemma 
10.2.1. Many have believed that the statistical features of quantum theory 
simply arise out of our ignorance of what is going on at very small length 
scales, much as statistical mechanics overcomes our ignorance of the de- 
tailed motion of all the molecules in a gas. Quantum theory could then 
arise as an average over classical ‘hidden variables’, whose small size has so 
far precluded their detection. Many mathematicians and physicists investi- 
gated this idea and showed that the most obvious hidden variable theories 
could not reproduce some of the predictions of quantum mechanics, but 
Bell’s observation shows that there are always differences between the pre- 
dictions of any local hidden variable theory and quantum mechanics. It has 
formed the basis for many subsequent experiments to test the predictions 
of quantum measurement theory. 


Clauser, Horne, Shimony, and Holt, following an idea of Bohm, suggested 
that this could provide a practical test of quantum theory against classical 
ideas. An atom of calcium is irradiated by two lasers which excite it to 
a state with higher energy. It subsequently decays back to its original 
state emitting two photons of wavelengths 551.3 and 422.7 nanometres, 
respectively. Since the angular momentum of the initial and final states is 
the same one can show that the two photons emitted in opposite directions 
are identically polarized, so that we are in the situation described above. 
By putting polarizing filters in the paths of the two photons one can find the 
correlations between them and check whether the inequality of Corollary 
10.2.2 holds. More precisely one takes each filter at an angle $\phi$ to the
preceding one, so that, using (10.18) and recalling that $P(Q) = \tfrac12$,
quantum
FIGURE 10.9. Schematic diagram of Aspect's measurements of the violation of
the triangle inequality. The experimental points lie close to the theoretical
curve $3\sin^2\phi - \sin^2 3\phi$, and well below 0.


mechanics predicts

\[ P(Q_A\setminus R_B) + P(R_B\setminus S_A) + P(S_A\setminus T_B)
   - P(Q_A\setminus T_B)
   = \tfrac12\left(3\sin^2\phi - \sin^2 3\phi\right), \tag{10.19} \]

whilst classical theory predicts the left-hand side to be non-negative. (In
practice the experiments tend to use
$D(R_A, S_B) = P(R_A\setminus S_B) + P(S_B\setminus R_A)$,
but, since the probabilities depend only on the angles, this just doubles
everything and gives $3\sin^2\phi - \sin^2 3\phi$.)

Experiments of this kind were carried out by Freedman and Clauser and 
others in the early 1970s and in a more sensitive form by Aspect and his col- 
laborators a decade later. Their results verified that the triangle inequality 
for probabilities is indeed violated in the way predicted by quantum theory 
(Figure 10.9). 

In Aspect’s experiments it was possible to change the polarization di- 
rection of the first filter very rapidly so that it could be chosen after the 
photons had left the atom. The filters were far enough apart that the sec- 
ond photon would pass through its polarizing filter before any signal could 
arrive (even at the speed of light) to reveal which polarization direction had 
been chosen for the first filter. In this way direct communication between 
the photons could be ruled out. The results of these experiments therefore
seem to rule out local hidden variable theories. Non-local hidden variables 
cannot be excluded in this way, and some are able to mimic the results 
of quantum mechanics completely, so that they would be experimentally 
indistinguishable. These tend, however, to have other peculiarities (hidden 
variables which permit remote measurements to affect each other must be 
somewhat unusual), and it is questionable whether they achieve the origi- 
nal aim of providing a simple, intuitively appealing alternative to quantum 
theory. 


10.5. Mermin's marvellous machine 


In 1990 Mermin presented another particularly surprising quantum para- 
dox, based on an idea of Greenberger, Horne and Zeilinger. In his thought 
experiment a source emits triples of particles which then enter three dis- 
tant detectors, each of which can be set to measure one of two possible 
observables, X or Y, of the incoming particle. Simplifying the mathemat- 
ical structure of Mermin’s discussion in a minor way, let us suppose that 
each of $X$ and $Y$ can take only the values $\pm1$, so that $X^2 = 1 = Y^2$.
Suppose now that the source is set up so that whenever two of the three
detectors observe $Y$ and the third $X$ the product of the measured values
is $-1$. Writing $X_j$ and $Y_j$ for the observation by the $j$-th detector
of $X$ or $Y$, respectively, we have $X_1Y_2Y_3 = -1$ and similarly
$Y_1X_2Y_3 = -1$, $Y_1Y_2X_3 = -1$. From this we may deduce that

\[ (X_1Y_2Y_3)(Y_1X_2Y_3)(Y_1Y_2X_3) = (-1)^3 = -1. \tag{10.20} \]

In the classical case, where all observables commute, this reduces to

\[ (X_1Y_1^2)(X_2Y_2^2)(X_3Y_3^2) = -1, \tag{10.21} \]

and then, since $Y_j^2 = 1$, to

\[ X_1X_2X_3 = -1. \tag{10.22} \]


If the detectors are distant enough that we do not expect them to influence
each other, then the result of an observation of $X$ by detector $j$ should
be independent of which observables the other two detectors are measuring,
and $X_1X_2X_3$ should also be the product of the results when all three
detectors are switched to observe $X$. The answer in this case must
therefore be $-1$.

Let us now consider a quantum mechanical example of the same sort
of measurement, in which the three particles each have spin $\tfrac12$ and
the observables are chosen to be $X = \sigma_1$ and $Y = \sigma_2$. The source
emits the three particles in the state
$\psi = (\psi_{+++} + \psi_{---})/\sqrt2$, where $\psi_{+++}$ denotes
the state in which all three beams have $\sigma_3$ eigenvalue $+1$. For a
single spin $\tfrac12$ particle Definition 8.3.1 gives
$\sigma_1\psi_\pm = \psi_\mp$ and $\sigma_2\psi_\pm = \pm i\psi_\mp$, so that

\[ X_1Y_2Y_3\psi = i^2\psi = -\psi. \tag{10.23} \]

More generally, whenever two detectors measure $Y$ they obtain a product
$-1$, so that our hypothesis holds. Observables for different particles still
commute, so that we again have

\[ (X_1Y_2Y_3)(Y_1X_2Y_3)(Y_1Y_2X_3) = (X_1Y_1^2)(Y_2X_2Y_2)(Y_3^2X_3)
   = X_1(Y_2X_2Y_2)X_3. \tag{10.24} \]

However, this time the anticommutation relations for Pauli spin matrices
mean that $Y_2X_2 = -X_2Y_2$, so that the product reduces to $-X_1X_2X_3$,
and we have

\[ -X_1X_2X_3 = (X_1Y_2Y_3)(Y_1X_2Y_3)(Y_1Y_2X_3) = -1. \tag{10.25} \]

The result of measuring $X$ with each detector must therefore give the
product $+1$. In other words the quantum mechanical experiment would yield
exactly the opposite answer to that which is expected classically.
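
These eigenvalue relations can be verified directly with $8\times8$ matrices. The sketch below assumes the usual matrix representation of the Pauli matrices $\sigma_1$ and $\sigma_2$ in a basis of $\sigma_3$ eigenvectors; the helper names are illustrative.

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)      # sigma_1
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)   # sigma_2

def on_three(a, b, c):
    """Tensor product of three one-particle vectors or operators."""
    return reduce(np.kron, (a, b, c))

up = np.array([1, 0], dtype=complex)     # sigma_3 eigenvalue +1
down = np.array([0, 1], dtype=complex)   # sigma_3 eigenvalue -1
psi = (on_three(up, up, up) + on_three(down, down, down)) / np.sqrt(2)

def eigenvalue_on_psi(op):
    """Return the integer lam with op @ psi == lam * psi (checked by assertion)."""
    lam = np.vdot(psi, op @ psi)
    assert np.allclose(op @ psi, lam * psi)
    return int(round(float(lam.real)))

print(eigenvalue_on_psi(on_three(X, Y, Y)))  # -1
print(eigenvalue_on_psi(on_three(Y, X, Y)))  # -1
print(eigenvalue_on_psi(on_three(Y, Y, X)))  # -1
print(eigenvalue_on_psi(on_three(X, X, X)))  # 1
```

The first three lines reproduce the hypothesis that two $Y$ measurements and one $X$ measurement multiply to $-1$, while the last shows that three $X$ measurements multiply to $+1$, against the classical expectation of $-1$.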


10.6. Schrödinger's cat


Schrödinger shared some of Einstein's mistrust of quantum theory. (He
had originally hoped to interpret $|\psi(x)|^2$ as the actual charge density
of an electron in an atom and not just its probability density.) The
Einstein–Podolsky–Rosen paper appeared as Schrödinger arrived in Oxford as
an early refugee from the Nazi regime in Germany. Some physicists had
dismissed the paradox as interesting but no cause for alarm, since one should
never have expected the microscopic quantum world to be consistent with
an intuition based on everyday experience. Schrödinger's rebuttal of this
argument pointed out that microscopic events could have important
consequences in the everyday world too (see Figure 10.10):


One can also imagine quite burlesque cases. A cat is penned up in 
a steel chamber with the following fiendish contraption (which must 
be secured against direct interference by the cat): a Geiger counter 
containing a minute quantity of radioactive material, so small that in 
an hour perhaps one of the atoms may decay, but equally probably 
none will. If a decay occurs it is detected by the Geiger counter, 
FIGURE 10.10. Schrödinger's cat.


which activates a small hammer through a relay and smashes a phial 
of prussic acid. If one leaves this entire system alone, at the end of 
one hour one can say that the cat is still alive provided that no atom 
has decayed. The first decay would have poisoned it. The $\psi$-function
for the whole system would express this by containing the living and 
dead cat mixed or blended together in equal portions. 

It is typical of such cases that an uncertainty which is originally 
confined to the atomic domain is transformed into a gross uncertainty 
which can be distinguished by direct observation. That presents an 
obstacle to naive acceptance of the ‘blurred model’ as a picture of 
reality. In itself there is nothing ambiguous or contradictory about it. 
It is the difference between a blurred photograph or one out of focus 
and a picture of clouds or fog-banks. 


At first sight it might appear that one could equally well have replaced 
the radioactive atom and phial of poison by a robot which tosses a coin 
and then shoots the cat if the coin comes down tails. However, in that 
case at the end of the allotted time the cat would certainly be either alive 
or dead inside the box although we would not yet know which. The new 
ingredient introduced by quantum mechanics is that until we open the box 
it is apparently possible for the superposition of live and dead cats inside 
it to create interference effects. In fact, as Asher Peres pointed out in 
1978, this paradox could have even more bizarre consequences. The two 
FIGURE 10.11. Peres' quantum mechanical device for resuscitating dead cats.


possible states of Schrödinger's cat (live and dead) can be described in the
space $\mathbb{C}^2$ spanned by vectors $\psi_{\rm live}$ and
$\psi_{\rm dead}$. Mathematically this two-dimensional space is identical to
that used for describing the polarization states of the photon, so we can try
translating the polarizing filter ideas into the language of cats. Suppose
that at the end of the hour we found the cat to be dead, that is in the state
$\psi_{\rm dead}$. We might then carry out an observation to see if it were
in some half-dead state described by the vector
$\cos\theta\,\psi_{\rm dead} + \sin\theta\,\psi_{\rm live}$. We could then
look again to see if it was alive. The effect would be much the same as
inserting a third polarizing filter between two crossed filters, and there
would be a probability of $\tfrac14\sin^2 2\theta$ that the dead cat had been
resuscitated! (See Figure 10.11.)

In 1961 Wigner suggested another variant of the idea, which essentially 
replaces the release of poison by the flash of a light, and the unfortunate 
cat by Wigner’s friend. One is then faced with the conundrum of whether 
the wave function describing the light and friend projects down as soon as 
the friend observes the light, or only when the door is opened. 


10.7. Many worlds and one world 


In the half century since these paradoxes were first proposed they have fas- 
cinated mathematicians, physicists, and philosophers alike. The attempts 
to resolve them are too numerous to recount, although some can now be 
ruled out in the light of Aspect’s results. Some have tried to change quan- 
tum mechanics in some way, so that the collapse of the wave function during 
a measurement is governed by a modified Schrödinger equation, and so fits
into the same dynamical framework as the usual unitary time evolution. 
Others accept the mathematical formulation of quantum theory but seek 
to interpret it in a way which accords more with our intuition derived from 
classical mechanics. We shall briefly mention just two of these, one chosen
because it is often described in popular accounts of quantum theory,
the other because it is based on a nice mathematical theorem. (Two more 
theories are mentioned in Section 15.1.) 

Hugh Everett III’s ‘many worlds’ picture accepts the mathematical for- 
mulation of quantum theory, but gives it a new physical interpretation. As 
we have seen the most perplexing feature of the formalism is the way in 
which a state is changed by the act of measurement. The many worlds 
interpretation is that, as a measurement is made, the universe splits into 
many different copies, in each of which just one of the measured outcomes 
occurs. Thus, as one opens the steel chamber to see whether Schrödinger's
cat is alive or dead, the universe bifurcates into one universe containing 
a decayed atom, a broken phial, and a dead cat, and another in which 
the atom and the phial are in pristine condition and a tetchy cat emerges 
unscathed. With its science fiction flavour, this is perhaps the best-known 
attempt to resolve the paradoxes. 

At the opposite extreme is an idea which stresses the unity of the uni- 
verse. No system is ever truly isolated: it interacts with the outside world 
by gravitational, electromagnetic, or other forces, or else we could not ob- 
serve it at all. The very act of measurement forces the system to interact 
with the measuring apparatus, and for an accurate measurement this in- 
teraction must be strong and swift. 

The following theorem, whose proof is given in Appendix A2, describes 
one possible consequence of studying the system and environment together. 


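Theorem 10.7.1. Let $V$ be a two-dimensional inner product space
and $\Omega$ a vector in $V$. Then there exists an inner product space
$\mathcal{H}$, a family of unitary operators $U_t$, a homomorphism
$\phi : \mathcal{L}(V) \to \mathcal{L}(\mathcal{H})$ which respects adjoints,
and for each vector $\psi \in V$ a vector $\Psi \in \mathcal{H}$ such that for
all $A \in \mathcal{L}(V)$ we have

\[ \langle\Psi|\phi(A)\Psi\rangle = \langle\psi|A\psi\rangle, \]
\[ \lim_{t\to\infty}\langle U_t\Psi|\phi(A)U_t\Psi\rangle
   = \langle\Omega|A\Omega\rangle, \]

where the inner products on the left are in $\mathcal{H}$ and those on the
right are in $V$.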


We interpret $V$ as the space for a system with two independent states
(such as the polarized photon, spinning electron, or Schrödinger's cat), and
$\mathcal{H}$ as that for the environment as well. (The restriction to two
states is for simplicity only and is not essential.) The homomorphism $\phi$
tells us how the observables for the two-state system can be interpreted as
observables for that system together with its environment. Whatever the
initial state, for large times the expectation value of any observable for
the two-state system tends to the value given by the vector $\Omega$.
Although the time evolution is given by perfectly ordinary unitary operators
$U_t$, the effect on the system is just the same as the collapse to $\Omega$
during a measurement, except that it is only asymptotic and not immediate.
However, the asymptotic state is approached exponentially like
$\exp(-\eta t)$ where $\eta$ is the strength of the coupling between the
system and its environment. The time-scale for the collapse should therefore
be about the same as for the measurement and, since on the subatomic scale
things tend to happen fast, could easily be of order $10^{-15}$ seconds. It
would not be easy to distinguish such a swift exponential decay from
instantaneous collapse. The 'open system' interpretation suggests that this
is what happens: the description of measurements in terms of projections is
just a useful approximation to the effect on the system alone of ordinary
time evolution during periods of rapid change.

One serious objection to this interpretation is the fact that the collapse 
is infinitely protracted: after any finite time it is still possible in principle 
to reverse the process. This might, however, be very difficult in practice if 
energy is radiated during the measurement, since within seconds this will be 
far out in space and there is no practical way of recovering it. Nonetheless, 
this might offer the possibility of testing such theories experimentally. 

The reader should not be worried if neither of these interpretations seems 
wholly convincing. Whether or not there are many worlds, there are cer- 
tainly many world views, and few explanations seem persuasive to more 
than a small band of enthusiasts. 


Exercises 


10.1 Show that, under suitable conditions, for any projection, $P$, and
unitary evolution group, $U_t = \exp(-itH/\hbar)$,

\[ PU_tP = (1 - itPHP/\hbar)P + O(t^2), \]

and deduce that

\[ PU_{t/n}PU_{t/n}\cdots PU_{t/n}P \to
   \exp\left(-\frac{it}{\hbar}PHP\right)P. \]

10.2 Let $\Lambda_{RS}$ be a random variable whose value is $\lambda$ if just
one of the events $R$ and $S$ occurs and $\mu$ otherwise. Show that

\[ E(\Lambda_{RS}) = \lambda D(R,S) + \mu[1 - D(R,S)]
   = \mu + (\lambda-\mu)D(R,S), \]

and hence that

\[ E(\Lambda_{RS}) - \mu \ge E(\Lambda_{QS}) - E(\Lambda_{QR}). \]

Deduce that

\[ -\mu + E(\Lambda_{RS}) \ge |E(\Lambda_{QS}) - E(\Lambda_{QR})|. \]

[This last inequality is Bell's inequality. When $\mu = 0$ it reduces to
the triangle inequality.]

Show that for any classical events $Q$, $R$, and $S$,

\[ P(Q\setminus R) + P(R\setminus S) - P(Q\setminus S)
   = P((Q\cap S)\setminus R) + P(R\setminus(Q\cup S)). \]

Show that, for $n+1 \ge 3$ filters $q_1,\ldots,q_{n+1}$, each at an angle
$\phi$ to its predecessor, and $Q_j$ the event that a photon is transmitted
by $q_j$, one has

\[ \sum_{j=1}^{n} P(Q_j\setminus Q_{j+1}) - P(Q_1\setminus Q_{n+1})
   = n\sin^2\phi - \sin^2 n\phi. \]

Show that for $n > 1$, $f(\phi) = n\sin^2\phi - \sin^2 n\phi$ has extrema
where $\phi = (k+\tfrac12)\pi/(n+1)$ for $k \in \mathbb{Z}$ or $\phi$ is a
multiple of $\pi/(n-1)$. When $\phi = \pi/2(n+1)$ show that $f$ has a minimum
value of

\[ (n+1)\sin^2\left(\frac{\pi}{2(n+1)}\right) - 1. \]

Deduce that this minimum decreases with increasing $n$ and find its
value for $n = 2, 3$.


11 Alternative formulations of quantum theory 


Heisenberg's new work, which will very soon appear, looks very mystical,
but it is certainly correct and deep.


MAX BORN, letter to Albert Einstein, 15 July 1925 


11.1. Pictures of quantum mechanics 


The mathematical description of quantum theory rests on the idea of states
and observables. However, the dynamical description of these two described
in Section 6.4 is quite different. When the Hamiltonian is time independent,
the states evolve according to the equation

\[ \psi_t = U_t\psi_0, \tag{11.1} \]

where $U_t$ is the unitary operator $\exp(-iHt/\hbar)$. Observables, such as
the position and momentum, which have no explicit time dependence, are
constant. (Observables may have explicit time dependence, for example
$X + tP$. In this section a typical Schrödinger observable at time $t$ will
be written as $A_t^S$.)


Definition 11.1.1. This description of the dynamics of states and
observables is called the Schrödinger picture.


Heisenberg’s description of quantum mechanics is quite different: the 
states are constant and the observables evolve. 


Definition 11.1.2. In the Heisenberg picture of quantum mechanics
the states remain constant $\psi_t = \psi_0$, and an observable, described
in the Schrödinger picture by $A_t^S$, evolves according to the equation
$A_t = U_t^* A_t^S U_t$.


In fact, both the Heisenberg and Schrödinger pictures of quantum mechanics
are special cases of the interaction picture, which compares the
actual evolution with another evolution chosen to serve as a reference. 
Definition 11.1.3. Let $V_t = \exp(-iH_0t/\hbar)$ be the reference
evolution. In the interaction or Dirac picture the states and observables
evolve according to the equations

\[ \psi_t = V_t^* U_t\psi_0, \]
\[ A_t = V_t^* A_t^S V_t. \]

Remark 11.1.1. When $V_t = 1$ we obtain the Schrödinger picture and
when $V_t = U_t$ the Heisenberg picture.


The quantities that are of physical interest, such as expectation values,
depend on both the states and the observables, and it is the evolution of
these which really matters.


Theorem 11.1.1. In the interaction picture the expectation of an
observable $A_t$ in a state $\psi_t$ is independent of the choice of $V_t$.

Proof. In the interaction picture

\[ \langle\psi_t|A_t\psi_t\rangle
   = \langle V_t^*U_t\psi_0|V_t^*A_t^SV_tV_t^*U_t\psi_0\rangle
   = \langle U_t\psi_0|A_t^SU_t\psi_0\rangle, \tag{11.2} \]

since $V_t$ is unitary and its inverse is $V_t^*$. Similarly
$\langle\psi_t|\psi_t\rangle = \langle U_t\psi_0|U_t\psi_0\rangle$,
so that

\[ \frac{\langle\psi_t|A_t\psi_t\rangle}{\|\psi_t\|^2}
   = \frac{\langle U_t\psi_0|A_t^SU_t\psi_0\rangle}{\|U_t\psi_0\|^2} \tag{11.3} \]

independent of $V_t$. □

Corollary 11.1.2. The Heisenberg, Schrödinger, and interaction
pictures all give the same expectation values.
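
The independence of the reference evolution is easy to test numerically in finite dimensions. The following is a minimal sketch, assuming $\hbar = 1$, randomly generated Hermitian matrices for $H$, $H_0$ and a time-independent Schrödinger observable, and a hypothetical helper `evolution` that builds $\exp(-iGt/\hbar)$ from an eigendecomposition.

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(1)

def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def evolution(G, t):
    """exp(-i G t / hbar) for Hermitian G, via its spectral decomposition."""
    w, v = np.linalg.eigh(G)
    return (v * np.exp(-1j * w * t / hbar)) @ v.conj().T

n, t = 3, 1.3
H, H0, A_S = (random_hermitian(n) for _ in range(3))
psi0 = rng.normal(size=n) + 1j * rng.normal(size=n)
U = evolution(H, t)

def expectation(V):
    """<psi_t|A_t psi_t> / ||psi_t||^2 with reference evolution V."""
    psi_t = V.conj().T @ U @ psi0
    A_t = V.conj().T @ A_S @ V
    return (np.vdot(psi_t, A_t @ psi_t) / np.vdot(psi_t, psi_t)).real

schrodinger = expectation(np.eye(n))        # V_t = 1
heisenberg = expectation(U)                 # V_t = U_t
interaction = expectation(evolution(H0, t)) # V_t = exp(-i H0 t / hbar)
assert np.isclose(schrodinger, heisenberg)
assert np.isclose(schrodinger, interaction)
```

The three values agree to rounding error whatever Hermitian matrix is used for $H_0$, in line with the corollary.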
11.2. Differential equations for the time evolution


In practice we have generally found the time evolution of wave functions
using Schrödinger's equation, so it is useful to know its analogues in the
other pictures.


Theorem 11.2.1. In the interaction picture the time evolution of
states and observables is governed by the differential equations:

\[ i\hbar\frac{d\psi_t}{dt} = H_t^I\psi_t, \]
\[ \frac{dA_t}{dt} = \frac{\partial A_t}{\partial t}
   + \frac{i}{\hbar}[H_0, A_t], \]

where $H_t^I = V_t^*(H - H_0)V_t$, and
$\partial A_t/\partial t = V_t^*(dA_t^S/dt)V_t$.

Proof. By definition

\[ i\hbar\frac{dU_t}{dt} = HU_t = U_tH, \tag{11.4} \]

and

\[ i\hbar\frac{dV_t^*}{dt} = V_t^*(-H_0) = -V_t^*H_0. \tag{11.5} \]

So, differentiating $\psi_t = V_t^*U_t\psi_0$, we obtain

\[ i\hbar\frac{d\psi_t}{dt}
   = V_t^*(-H_0)U_t\psi_0 + V_t^*HU_t\psi_0
   = V_t^*(H - H_0)V_tV_t^*U_t\psi_0
   = H_t^I\psi_t. \tag{11.6} \]

Similarly

\[ i\hbar\frac{dA_t}{dt}
   = (-H_0)V_t^*A_t^SV_t + i\hbar V_t^*\frac{dA_t^S}{dt}V_t
     + V_t^*A_t^SV_tH_0 \]
\[ = i\hbar V_t^*\frac{dA_t^S}{dt}V_t - H_0A_t + A_tH_0
   = i\hbar\frac{\partial A_t}{\partial t} + [A_t, H_0], \tag{11.7} \]

and the second formula follows immediately. □


Setting $V_t = U_t$ we obtain the following result:

Corollary 11.2.2. In the Heisenberg picture where states are constant,
observables evolve according to the equation

\[ \frac{dA}{dt} = \frac{\partial A}{\partial t} + \frac{i}{\hbar}[H, A]. \]

It follows immediately that a Hamiltonian which does not explicitly depend
on time (so that $\partial H/\partial t = 0$) is actually constant. One can
also deal with the case of time-dependent Hamiltonians by using the
differential equation of the corollary to define the evolution of observables
in the Heisenberg picture.
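
The Heisenberg equation of motion can be compared with a finite-difference derivative of $U_t^* A U_t$. The sketch below assumes $\hbar = 1$, a random finite-dimensional Hermitian Hamiltonian, and an observable with no explicit time dependence (so the partial derivative term vanishes); all names are illustrative.

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(0)

def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2

def evolution(H, t):
    """U_t = exp(-i H t / hbar), built from the spectral decomposition of H."""
    energies, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * energies * t / hbar)) @ vecs.conj().T

def heisenberg(A, H, t):
    """Heisenberg-picture observable U_t^* A U_t."""
    U = evolution(H, t)
    return U.conj().T @ A @ U

H = random_hermitian(4)
A = random_hermitian(4)   # Schrodinger observable, constant in time
t, dt = 0.7, 1e-5

A_t = heisenberg(A, H, t)
# Central difference approximation to dA_t/dt.
numerical = (heisenberg(A, H, t + dt) - heisenberg(A, H, t - dt)) / (2 * dt)
# Right-hand side of the corollary: (i / hbar) [H, A_t].
predicted = (1j / hbar) * (H @ A_t - A_t @ H)
assert np.allclose(numerical, predicted, atol=1e-5)
# The Hamiltonian itself is constant in the Heisenberg picture.
assert np.allclose(heisenberg(H, H, t), H)
```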


11.3. Time-dependent perturbation theory 


The interaction picture enables us to compare two time evolutions. This
is particularly useful for complicated Hamiltonians, where Schrödinger's
equation cannot be integrated directly, but one can compare it with some
simpler reference system which is supposed to be understood. The
differential equation of Theorem 11.2.1 can be integrated to give

\[ \psi_t = \psi_0 + \frac{1}{i\hbar}\int_0^t H_s^I\psi_s\,ds. \tag{11.8} \]

If there were no perturbation, $H'$, then $V_t$ and $U_t$ would be identical,
and we should be in the Heisenberg picture, where $\psi$ is constant. We
therefore take this as the starting point for an iterative scheme with

\[ \psi_t^{(0)} = \psi_0, \tag{11.9} \]

and

\[ \psi_t^{(N)} = \psi_0
   + \frac{1}{i\hbar}\int_0^t H_{t_N}^I\psi_{t_N}^{(N-1)}\,dt_N, \tag{11.10} \]

as the $N$-th order approximation, for $N \ge 1$. This expression can also be
written as

\[ \psi_t^{(N)} = \psi_0 + \sum_{n=1}^{N}(i\hbar)^{-n}
   \int_0^t\cdots\int_0^{t_2} H_{t_n}^I H_{t_{n-1}}^I\cdots H_{t_1}^I\psi_0
   \,dt_1\,dt_2\ldots dt_n, \tag{11.11} \]

as is readily proved by induction. Unfortunately, in some of the most
important applications the integrals are not well defined, and even when the
integrals are well behaved the sequence $\psi_t^{(N)}$ often diverges.
Nonetheless, this is an important theoretical tool.

When it does converge we may rewrite it as a formula for the time
evolution,

\[ V_t^*U_t = 1 + \sum_{n=1}^{\infty}(i\hbar)^{-n}
   \int_0^t\cdots\int_0^{t_2} H_{t_n}^I H_{t_{n-1}}^I\cdots H_{t_1}^I
   \,dt_1\,dt_2\ldots dt_n. \tag{11.12} \]

Multiplying by $V_t$ we arrive at the following result:

Proposition 11.3.1. (The Feynman–Dyson expansion) The
time evolution operator $U_t$ enjoys a formal expansion as

\[ U_t = V_t + \sum_{n=1}^{\infty}(i\hbar)^{-n}
   \int_0^t\cdots\int_0^{t_2}
   V_{t-t_n}H'V_{t_n-t_{n-1}}H'\cdots H'V_{t_1}
   \,dt_1\,dt_2\ldots dt_n, \]

where $H' = H - H_0$.

Remark 11.3.1. The integrand in the $n$-th order term of the series

\[ V_{t-t_n}H'V_{t_n-t_{n-1}}\cdots V_{t_2-t_1}H'V_{t_1} \tag{11.13} \]

can be interpreted as describing a system in which the perturbation
$H' = (H - H_0)$ is turned on only at times $t_1, t_2, \ldots, t_n$. In
between the system evolves as though there were no perturbation at all, and
the Hamiltonian were just $H_0$.
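
For a small perturbation of a two-level system the iterative scheme can be carried out numerically and compared with the exact interaction-picture state $V_t^*U_t\psi_0$. The following is only a sketch under stated assumptions: $\hbar = 1$, an illustrative choice of $H_0$ and $H'$, and the integrals approximated by the trapezoidal rule on a uniform grid; the function names are hypothetical.

```python
import numpy as np

hbar = 1.0
H0 = np.diag([0.0, 1.0]).astype(complex)               # reference Hamiltonian (assumed)
Hp = 0.2 * np.array([[0, 1], [1, 0]], dtype=complex)   # perturbation H' (assumed)
H = H0 + Hp

def evolution(G, t):
    """exp(-i G t / hbar) for Hermitian G."""
    w, v = np.linalg.eigh(G)
    return (v * np.exp(-1j * w * t / hbar)) @ v.conj().T

def H_int(s):
    """Interaction-picture perturbation V_s^* H' V_s."""
    V = evolution(H0, s)
    return V.conj().T @ Hp @ V

def picard(psi0, t, order, steps=400):
    """Apply the iteration for psi^(N) `order` times on a uniform time grid."""
    s = np.linspace(0.0, t, steps + 1)
    ds = s[1] - s[0]
    HI = np.array([H_int(x) for x in s])        # shape (steps + 1, 2, 2)
    psi = np.tile(psi0, (steps + 1, 1))         # zeroth order: constant state
    for _ in range(order):
        integrand = np.einsum('kij,kj->ki', HI, psi)
        # Cumulative trapezoidal integral from 0 to each grid time.
        increments = (integrand[1:] + integrand[:-1]) * ds / 2
        cumulative = np.concatenate([np.zeros((1, 2)), np.cumsum(increments, axis=0)])
        psi = psi0 + cumulative / (1j * hbar)
    return psi[-1]

psi0 = np.array([1, 0], dtype=complex)
t = 2.0
exact = evolution(H0, t).conj().T @ evolution(H, t) @ psi0
errors = [np.linalg.norm(picard(psi0, t, N) - exact) for N in range(5)]
print(errors)
```

In this example the perturbation is bounded and small, so the errors shrink rapidly with the order; the text's warning about divergence concerns less well-behaved cases.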


11.4. Fermi’s golden rule 


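Let us suppose that we start with a system that is in an eigenstate $\phi_1$
of energy $e_1$ for the unperturbed Hamiltonian. We wish to estimate the
transition probability that after a time $t$ a measurement will show it to
have evolved to an eigenstate $\phi_2$ of different energy
$e_2 = e_1 - \epsilon$. After time $t$ the Schrödinger wave function is
$U_t\phi_1$, so that the transition probability is

\[ |\langle\phi_2|U_t\phi_1\rangle|^2
   = |\langle V_t^*\phi_2|V_t^*U_t\phi_1\rangle|^2
   = |\langle e^{ie_2t/\hbar}\phi_2|V_t^*U_t\phi_1\rangle|^2
   = |\langle\phi_2|V_t^*U_t\phi_1\rangle|^2. \tag{11.14} \]

As the eigenvector $\phi_2$ corresponding to a distinct eigenvalue is
orthogonal to $\phi_1$, we have

\[ \langle\phi_2|V_t^*U_t\phi_1\rangle
   = \langle\phi_2|\phi_1\rangle
     + \frac{1}{i\hbar}\int_0^t\langle\phi_2|H_s^IV_s^*U_s\phi_1\rangle\,ds
   = \frac{1}{i\hbar}\int_0^t\langle\phi_2|H_s^IV_s^*U_s\phi_1\rangle\,ds.
   \tag{11.15} \]

From this we see that already the lowest-order approximation to
$V_t^*U_t\phi_1$ will provide a second-order approximation to the transition
probability. This gives as an approximate value

\[ \langle\phi_2|V_t^*U_t\phi_1\rangle
   \approx \frac{1}{i\hbar}\int_0^t\langle\phi_2|H_s^I\phi_1\rangle\,ds \]
\[ = \frac{1}{i\hbar}\int_0^t\langle V_s\phi_2|H'V_s\phi_1\rangle\,ds \]
\[ = \frac{1}{i\hbar}\int_0^t\exp[-i(e_1-e_2)s/\hbar]\,
     \langle\phi_2|H'\phi_1\rangle\,ds \]
\[ = \frac{e^{-i\epsilon t/\hbar}-1}{\epsilon}\,
     \langle\phi_2|H'\phi_1\rangle \]
\[ = -i\,e^{-i\epsilon t/2\hbar}\,
     \frac{\sin(\epsilon t/2\hbar)}{\epsilon/2}\,
     \langle\phi_2|H'\phi_1\rangle. \tag{11.16} \]

From this we now deduce the transition probability by squaring the
modulus.

Proposition 11.4.1. (Fermi's golden rule) The second-order
approximation to the transition probability after time $t$ is

\[ |\langle\phi_2|U_t\phi_1\rangle|^2 \approx
   \left(\frac{\sin(\epsilon t/2\hbar)}{\epsilon/2}\right)^2
   |\langle\phi_2|H'\phi_1\rangle|^2, \]

where $\epsilon$ is the energy difference between the two levels.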


Fermi’s golden rule confirms in a very explicit way the remarks about time–energy uncertainty relations made in Section 7.3. It is instructive to plot this approximate transition probability against the energy difference, see Figure 11.1. At a given time $t$ some transitions are much more probable than others. Viewed the other way round, any given transition is more probable at some times than others. One should not be unduly worried by the fact that the central peak has height $|(\phi_2|H'\phi_1)t/\hbar|^2$ in defiance of the probabilistic interpretation for large $t$. This is, of course, just an indication that the approximation fails under those conditions. (It does, however, serve as a reminder that our arguments have been rather cavalier: a proper mathematical treatment of the Fermi rule is far from trivial.)

FIGURE 11.1. The graph of $\sin x/x$, whose square gives the approximate probability for a transition from one energy level to another.


11.5. The harmonic oscillator in the Heisenberg picture 


In the Heisenberg picture the differential equation of motion differs from 
that of classical Hamiltonian mechanics only in the replacement of the 
Poisson bracket by $i/\hbar$ times the commutator. This means that it is often
possible to solve the quantum problem by more or less the same method 
as in the classical theory. 

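For the one-dimensional harmonic oscillator with Hamiltonian $H = P^2/2m + \tfrac12 m\omega^2X^2$ one has

$$\begin{aligned} \frac{dP}{dt} &= \frac{i}{\hbar}[H,P] \\ &= \frac{i}{\hbar}\left[\tfrac12 m\omega^2X^2, P\right] \\ &= \frac{im\omega^2}{2\hbar}\left(X[X,P]+[X,P]X\right) \\ &= \frac{im\omega^2}{2\hbar}(2i\hbar X) \\ &= -m\omega^2X, \end{aligned} \qquad (11.17)$$

and similarly

$$\begin{aligned} \frac{dX}{dt} &= \frac{i}{\hbar}[H,X] \\ &= \frac{i}{\hbar}\left[\frac{P^2}{2m}, X\right] \\ &= \frac{i}{2m\hbar}\left(P[P,X]+[P,X]P\right) \\ &= \frac{P}{m}. \end{aligned} \qquad (11.18)$$
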
These look exactly like the classical equations of motion and can be 
solved by the same observation (Exercise 7.18) that 


$$\frac{d}{dt}(P + im\omega X) = -m\omega^2X + i\omega P = i\omega(P + im\omega X). \qquad (11.19)$$

This can immediately be integrated to give

$$(P + im\omega X)_t = e^{i\omega t}(P + im\omega X)_0. \qquad (11.20)$$
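
The time dependence of $P$ and $X$ separately can be read off from (11.20). A short worked step, assuming only that $P$ and $X$ are self-adjoint and writing $P_0$, $X_0$ for the operators at time 0:

```latex
\begin{aligned}
(P - im\omega X)_t &= e^{-i\omega t}(P - im\omega X)_0
  && \text{(adjoint of (11.20))} \\
P_t &= P_0\cos\omega t - m\omega X_0\sin\omega t
  && \text{(half the sum of the two relations)} \\
X_t &= X_0\cos\omega t + \frac{P_0}{m\omega}\sin\omega t
  && \text{(their difference divided by } 2im\omega\text{)}
\end{aligned}
```

These are formally identical to the classical solution for the oscillator, with the initial data replaced by operators.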


The reader may wonder why we have concealed this extraordinary simplic- 
ity for so long. Unfortunately it is only for quadratic potentials that one 
finds such close agreement between classical and quantum mechanics; for 
anything more complicated the Heisenberg picture becomes considerably 
harder to handle. In any case more work is needed to obtain the bound
state energy levels, and we postpone that to Section 11.10.


11.6*. Statistical mechanical states 


In view of the strong role already played by probability in quantum the- 
ory it may come as a surprise to learn that we have not yet accounted 
for all the statistical aspects of physics. In practice physics is often con- 
cerned with large assemblages of particles of which we have only a very 
imperfect knowledge. We can scarcely hope to specify the states of all 
$10^{22}$–$10^{23}$ molecules in a litre of gas, let alone their constituent protons, neutrons, electrons, or even quarks. Even classical mechanics handles such systems by means of statistical averages and we should like to do the same in quantum theory. There is a clear difference between the probabilities that enter here as a means of covering our ignorance, and the probabilities encountered already in quantum theory. Consider, for example, a photon passing through a screen with two slits, and let $\psi_1$ and $\psi_2$ denote the wave functions corresponding to passage through the first slit or the second slit, respectively. When both slits are open the appropriate wave function is the superposition of these, $\psi = \psi_1 + \psi_2$, and this leads to interference effects. Suppose, on the other hand, that a laboratory technician closes one slit, chosen by the toss of a coin, but forgets to record which one. Probabilities again enter, but we would no longer see any interference patterns, since the photon encountered only one open slit, even though we do not know which it was.

In order to handle this situation mathematically, we recall that vectors that are multiples of each other correspond to the same physical state. In other words, it is really the one-dimensional subspace spanned by $\psi$ which is physically significant, and not $\psi$ itself. This subspace is determined by the orthogonal projection $Q_\psi$ that projects onto it:

$$Q_\psi\xi = \frac{(\psi|\xi)}{\|\psi\|^2}\,\psi. \qquad (11.21)$$

In fact, there are simple formulae for the physically significant quantities directly in terms of $Q_\psi$. In particular, if $A$ is an observable then

$$(\xi|Q_\psi A\xi) = \left(\xi\,\Big|\,\frac{(\psi|A\xi)}{\|\psi\|^2}\,\psi\right) = \frac{(\xi|\psi)(\psi|A\xi)}{\|\psi\|^2}. \qquad (11.22)$$

Summing over the vectors $\xi$ in an orthonormal basis, we obtain the trace

$$\mathrm{tr}(Q_\psi A) = \frac{(\psi|A\psi)}{\|\psi\|^2} = E_\psi(A), \qquad (11.23)$$

which shows explicitly how to recover the expectation of $A$ from the projection.

Suppose now that we know only that there is a probability $p_j$ that the state is described by the vector $\psi_j$ in the orthonormal set $\{\psi_j : j = 1, 2, \ldots, n\}$. The expectation value should then be given by the weighted average

$$\sum_j p_jE_{\psi_j}(A) = \sum_j p_j\,\mathrm{tr}(Q_{\psi_j}A) = \mathrm{tr}\Big[\sum_j p_jQ_{\psi_j}A\Big]. \qquad (11.24)$$


188 ALTERNATIVE FORMULATIONS OF QUANTUM THEORY 


This suggests that such a statistical system is best described by the operator 


$$\rho = \sum_j p_jQ_{\psi_j}. \qquad (11.25)$$

This is a self-adjoint operator because the projections are self-adjoint and the probabilities real. It is positive because each $p_j \ge 0$, and also

$$\mathrm{tr}(\rho) = \sum_j p_j\,\mathrm{tr}(Q_{\psi_j}) = \sum_j p_j = 1. \qquad (11.26)$$


Definition 11.6.1. A positive operator $\rho$ that satisfies $\mathrm{tr}(\rho) = 1$ is called a density operator.

Definition 11.6.2. In a quantum statistical system whose state is described by a density operator $\rho$ the expectation of the observable $A$ is given by

$$E_\rho(A) = \mathrm{tr}(\rho A).$$


In infinite dimensions it is already a strong constraint on an operator that it have a finite trace at all: the series that should give the trace is usually divergent. In fact, an operator that has a trace can be written in the form of a (possibly infinite) sum $\sum_j \lambda_jQ_j$, where the $Q_j$ form a family of mutually orthogonal projections (that is, $Q_jQ_k = 0$ if $j \ne k$), and

$$\sum_j |\lambda_j| < \infty. \qquad (11.27)$$

Thus the form $\sum_j p_jQ_{\psi_j}$ is essentially the most general possible. There are, as we shall see in Section 11.11, some infinite quantum systems whose states are better described by a slight generalization of a density operator, but for most finite systems density operators suffice.

If we know that the wave function is precisely $\psi$ then we can take $\rho = Q_\psi$, and effectively return to ordinary quantum theory.

Definition 11.6.3. States described by one-dimensional projections

$$\rho = Q_\psi$$

are called pure states. States described by more general density operators are called mixed states.


Thus all the states that we have used in the earlier chapters have really been pure states. There is an easy way to recognize density operators and to see which of them describe pure states. For this we recall that if the difference $A - B$ between two operators $A$ and $B$ is positive then we say that $B \le A$.

Proposition 11.6.1. The density operator satisfies the conditions

$$0 \le \rho^2 \le \rho \le 1.$$

The state is pure if and only if $\rho^2 = \rho$.


Proof. It is clear that, as the square of a self-adjoint operator, $\rho^2$ is positive. For any orthonormal basis $\{\psi_j\}$ we have

$$(\psi_j|\psi_j) = 1 = \mathrm{tr}(\rho) = \sum_k (\psi_k|\rho\psi_k) \ge (\psi_j|\rho\psi_j). \qquad (11.28)$$

So $1 - \rho \ge 0$, and since $\rho$ commutes with $1 - \rho$ and is also positive, we have $\rho - \rho^2 = \rho(1-\rho) \ge 0$. This has now established all the inequalities.

It is also clear that if $\rho = Q_\psi$ is pure then $\rho^2 = \rho$. Conversely, if $\rho^2 = \rho$ then $\rho$ is a projection. By considering matrices with respect to suitable bases, the trace of a projection is easily seen to be the dimension of its range. Thus $\rho$ has rank $\mathrm{tr}(\rho) = 1$, and is therefore a one-dimensional projection, and so takes the form $Q_\psi$ for some $\psi$. $\square$
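
The distinction between a mixed state and a superposition can be made concrete in two dimensions. The following is a minimal sketch in plain Python, assuming that the basis vectors (1, 0) and (0, 1) stand for passage through the first and second slit; the helper names are hypothetical.

```python
import math


def projection(psi):
    """Matrix of Q_psi, the orthogonal projection onto the span of psi."""
    norm_sq = sum(abs(c) ** 2 for c in psi)
    n = len(psi)
    return [[psi[r] * psi[c].conjugate() / norm_sq for c in range(n)]
            for r in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def trace(a):
    return sum(a[k][k] for k in range(len(a)))


def density_operator(probabilities, vectors):
    """rho = sum_j p_j Q_{psi_j} for orthonormal vectors psi_j."""
    n = len(vectors[0])
    rho = [[0j] * n for _ in range(n)]
    for p, psi in zip(probabilities, vectors):
        q = projection(psi)
        for r in range(n):
            for c in range(n):
                rho[r][c] += p * q[r][c]
    return rho


def expectation(rho, a):
    """E_rho(A) = tr(rho A)."""
    return trace(matmul(rho, a))


def is_pure(rho, tol=1e-12):
    """A state is pure exactly when rho^2 = rho."""
    sq = matmul(rho, rho)
    n = len(rho)
    return all(abs(sq[r][c] - rho[r][c]) < tol
               for r in range(n) for c in range(n))


if __name__ == "__main__":
    psi1, psi2 = [1 + 0j, 0j], [0j, 1 + 0j]
    s = 1 / math.sqrt(2)
    mixed = density_operator([0.5, 0.5], [psi1, psi2])   # one slit closed at random
    superposed = density_operator([1.0], [[s + 0j, s + 0j]])  # both slits open
    swap = [[0, 1], [1, 0]]  # an observable sensitive to interference
    print(is_pure(mixed), expectation(mixed, swap).real)
    print(is_pure(superposed), expectation(superposed, swap).real)
```

Both states assign probability one half to each slit, but only the superposition gives a non-zero expectation to the off-diagonal observable.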


11.7*. Spin systems 


To develop a feeling for density operators it is useful to consider the case of a two-dimensional inner product space $\mathcal{H}$, which could describe a spinning electron or the polarization states of a photon. We have already noted in Theorem 8.8.1 that any self-adjoint operator $\rho$ can be written in the form

$$\rho = \tfrac12(P_0 1 + \mathbf{P}.\boldsymbol{\sigma}), \qquad (11.29)$$

where $P_0 = \mathrm{tr}(\rho)$ and $P_j = \mathrm{tr}(\rho\sigma_j)$ for $j = 1, 2, 3$. For a density operator we require that $\mathrm{tr}(\rho) = 1$, so

$$\rho = \tfrac12(1 + \mathbf{P}.\boldsymbol{\sigma}). \qquad (11.30)$$

Definition 11.7.1. The vector $\mathbf{P} \in \mathbf{R}^3$ is called the polarization vector.

Proposition 11.7.1. The possible density operators for a spin $\tfrac12$ system are characterized by a polarization vector $\mathbf{P}$ that lies in the unit ball in $\mathbf{R}^3$. The pure states are associated with polarization vectors lying on the surface, that is in the unit sphere.

Proof. Applying Theorem 8.7.1(ii) to $2\rho - 1 = \mathbf{P}.\boldsymbol{\sigma}$ we have

$$4\rho^2 - 4\rho + 1 = |\mathbf{P}|^2, \qquad (11.31)$$

from which we deduce that

$$4(\rho - \rho^2) = 1 - |\mathbf{P}|^2. \qquad (11.32)$$

It is therefore clear that $\rho^2 \le \rho$ if and only if $|\mathbf{P}|^2 \le 1$, and that $\rho$ is pure (that is, $\rho^2 = \rho$) if and only if $|\mathbf{P}| = 1$. $\square$


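Any observable $A$ is a self-adjoint operator and so can be written in the form

$$A = \tfrac12(A_0 1 + \mathbf{A}.\boldsymbol{\sigma}). \qquad (11.33)$$

We therefore have

$$\begin{aligned} E_\rho(A) &= \mathrm{tr}(\rho A) \\ &= \mathrm{tr}\left(\tfrac12(1 + \mathbf{P}.\boldsymbol{\sigma})\,\tfrac12(A_0 1 + \mathbf{A}.\boldsymbol{\sigma})\right) \\ &= \tfrac12(A_0 + \mathbf{P}.\mathbf{A}). \end{aligned} \qquad (11.34)$$
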
Remark 11.7.1. The state $\rho = \tfrac12 1$ is the density operator for an unpolarized state in which $\mathbf{P} = (P_1, P_2, P_3) = 0$. Unpolarized light is well described by a state of this kind. Suppose that $\psi$ is the state that can be transmitted by a particular polarizing filter and let $Q_\psi$ be the projection onto the space spanned by $\psi$. The probability that an unpolarized beam described by the density operator $\rho$ will pass through the filter is

$$\mathrm{tr}(\rho Q_\psi) = (\psi|\rho\psi)/\|\psi\|^2 = \tfrac12. \qquad (11.35)$$

Thus an unpolarized beam has an even chance of being passed by any filter.
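
The correspondence between density operators and polarization vectors is easy to experiment with. The sketch below assumes the standard Pauli matrices for $\sigma_1, \sigma_2, \sigma_3$ and uses hypothetical helper names; it builds $\rho$ from $\mathbf{P}$, recovers $P_j = \mathrm{tr}(\rho\sigma_j)$, and evaluates the transmission probability $\mathrm{tr}(\rho Q_\psi)$.

```python
# Assumption: sigma_1, sigma_2, sigma_3 are the standard Pauli matrices.
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def density_from_polarization(p):
    """rho = (1 + P.sigma)/2 for a polarization vector P = (P1, P2, P3)."""
    return [[(float(r == c) + sum(p[j] * SIGMA[j][r][c] for j in range(3))) / 2
             for c in range(2)] for r in range(2)]


def polarization(rho):
    """Recover P_j = tr(rho sigma_j)."""
    result = []
    for sigma in SIGMA:
        prod = matmul(rho, sigma)
        result.append((prod[0][0] + prod[1][1]).real)
    return tuple(result)


def purity_defect(rho):
    """The matrix 4(rho - rho^2), which should equal (1 - |P|^2) times 1."""
    sq = matmul(rho, rho)
    return [[4 * (rho[r][c] - sq[r][c]) for c in range(2)] for r in range(2)]


def transmission_probability(rho, psi):
    """tr(rho Q_psi) = (psi|rho psi)/||psi||^2."""
    norm_sq = sum(abs(c) ** 2 for c in psi)
    value = sum(psi[r].conjugate() * rho[r][c] * psi[c]
                for r in range(2) for c in range(2))
    return (value / norm_sq).real


if __name__ == "__main__":
    unpolarized = density_from_polarization((0, 0, 0))
    print(transmission_probability(unpolarized, [1, 1j]))
    print(polarization(density_from_polarization((0.3, -0.2, 0.5))))
```

Any filter vector `psi` may be tried with the unpolarized state; the design keeps everything as nested lists so that no numerical library is needed.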


11.8*. Gibbs’ states 


In the example of a box full of gas mentioned at the start of Section 11.6, 
the system would in practice settle into dynamical equilibrium in which the 
probabilities of different states depended on their energies. Josiah Willard 
Gibbs worked out the precise relationship for classical systems at the end 
of the last century, and his answer carries straight over to quantum theory. 


Definition 11.8.1. If $H$, the Hamiltonian operator for a system, does not depend on time, and $\exp(-\beta H)$ has a finite trace, then the state described by the density operator

$$\rho = \frac{1}{\mathrm{tr}(e^{-\beta H})}\,e^{-\beta H}$$

is called the Gibbs state for inverse temperature $\beta$.

The inverse temperature corresponds to an actual temperature $1/k\beta$ K, where $k \approx 1.4 \times 10^{-23}$ joules per degree is known as the Boltzmann constant. The Gibbs state is that appropriate to a system which has come into thermal equilibrium at temperature $1/k\beta$ K.


Example 11.8.1. As an example consider the one-dimensional harmonic oscillator with Hamiltonian $H = P^2/2m + \tfrac12 m\omega^2X^2$. The eigenvalues of $H$ are $(n+\tfrac12)\hbar\omega$, so those of $\exp(-\beta H)$ are $\exp[-(n+\tfrac12)\beta\hbar\omega]$. The trace, being the sum of the eigenvalues, is

$$\sum_{n=0}^{\infty} \exp[-(n+\tfrac12)\beta\hbar\omega] = \frac{e^{-\frac12\beta\hbar\omega}}{1-e^{-\beta\hbar\omega}} = \tfrac12\,\mathrm{cosech}(\tfrac12\beta\hbar\omega). \qquad (11.36)$$

The probability of finding the energy to be $E_n = (n+\tfrac12)\hbar\omega$ is therefore

$$\exp[-(n+\tfrac12)\beta\hbar\omega] \times (1-e^{-\beta\hbar\omega})\,e^{\frac12\beta\hbar\omega} = e^{-n\beta\hbar\omega}(1-e^{-\beta\hbar\omega}). \qquad (11.37)$$


We therefore have a geometric distribution of energies. 
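
The closed form for the trace and the normalization of the geometric distribution can be checked numerically. This is only a sketch: the sum is truncated after a fixed number of terms (an assumption that is harmless once the terms have decayed), the single argument `beta_hw` stands for the dimensionless product $\beta\hbar\omega$, and the function names are hypothetical.

```python
import math


def partition_sum(beta_hw, terms=2000):
    """Truncated sum of exp[-(n + 1/2) beta hbar omega] over n = 0, 1, ..."""
    return sum(math.exp(-(n + 0.5) * beta_hw) for n in range(terms))


def closed_form_trace(beta_hw):
    """(1/2) cosech(beta hbar omega / 2)."""
    return 0.5 / math.sinh(0.5 * beta_hw)


def level_probability(n, beta_hw):
    """Geometric distribution e^{-n x}(1 - e^{-x}) with x = beta hbar omega."""
    return math.exp(-n * beta_hw) * (1.0 - math.exp(-beta_hw))


if __name__ == "__main__":
    x = 0.7
    print(partition_sum(x), closed_form_trace(x))
    print(sum(level_probability(n, x) for n in range(2000)))
```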

Rather than considering harmonic oscillators it might seem more natural 
to start with freely moving particles. Physically this means setting $\omega = 0$
to obtain the free particle Hamiltonian $H = P^2/2m$. However, as $\omega \to 0$
the expression for the trace diverges owing to the cosech term. When
the probability distribution expressing our ignorance is continuous rather 
than discrete, as in the case of the free particle, the states of the system 
cannot be described by a density operator. This is not really surprising 
since unrestricted freely moving particles are physically unable to come 
into equilibrium. For free particles in a finite box the trace is finite and 
equilibrium is possible. 


11.9*. The KMS condition 


There is a useful generalization of Gibbs’ states, which can be obtained 
by applying density operators in the Heisenberg picture. The connection 
comes from the relation between p and the time evolution operators: 


$$e^{-\beta H} = e^{-i(-i\beta\hbar)H/\hbar} = U_{-i\beta\hbar}. \qquad (11.38)$$


Theorem 11.9.1. (The KMS condition) Let $E_\beta$ denote the expectation in the Gibbs state with density operator a multiple of $\exp(-\beta H)$, and suppose that the Heisenberg evolution of observables, $A_t = U_t^*A_0U_t$, extends to complex values of $t$. Then

$$E_\beta(A_tB) = E_\beta(BA_{t+i\beta\hbar}).$$


Proof. We first note that, since $U_s^* = U_{-s}$, the Heisenberg equation of motion yields

$$U_{-s}A_t = U_{-s}A_tU_sU_{-s} = A_{t+s}U_{-s}. \qquad (11.39)$$

Setting $s = i\hbar\beta$, multiplying by $B$, and taking the trace, we obtain

$$\mathrm{tr}(e^{-\beta H}A_tB) = \mathrm{tr}(U_{-i\beta\hbar}A_tB) = \mathrm{tr}(A_{t+i\beta\hbar}U_{-i\beta\hbar}B). \qquad (11.40)$$

Since $\mathrm{tr}(CB) = \mathrm{tr}(BC)$ we deduce that

$$\mathrm{tr}(e^{-\beta H}A_tB) = \mathrm{tr}(U_{-i\beta\hbar}BA_{t+i\beta\hbar}) = \mathrm{tr}(e^{-\beta H}BA_{t+i\beta\hbar}), \qquad (11.41)$$

and dividing by $\mathrm{tr}(\exp(-\beta H))$ the result follows. $\square$
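
In finite dimensions the KMS condition can be verified directly. The sketch below assumes a Hamiltonian that is diagonal in the chosen basis, with units in which $\hbar = 1$, so that the Heisenberg evolution multiplies the $(r, c)$ matrix entry by $\exp[i(E_r - E_c)t/\hbar]$ and extends to complex $t$ without difficulty; the function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def heisenberg(a, energies, t):
    """A_t = U_t^* A U_t for H = diag(energies); t may be complex."""
    n = len(energies)
    return [[cmath.exp(1j * (energies[r] - energies[c]) * t / HBAR) * a[r][c]
             for c in range(n)] for r in range(n)]


def gibbs_expectation(x, energies, beta):
    """tr(e^{-beta H} X) / tr(e^{-beta H}) for H = diag(energies)."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(w * x[k][k] for k, w in enumerate(weights)) / sum(weights)


def kms_sides(a, b, energies, beta, t):
    """Return (E_beta(A_t B), E_beta(B A_{t + i beta hbar}))."""
    lhs = gibbs_expectation(matmul(heisenberg(a, energies, t), b),
                            energies, beta)
    shifted = heisenberg(a, energies, t + 1j * beta * HBAR)
    rhs = gibbs_expectation(matmul(b, shifted), energies, beta)
    return lhs, rhs


if __name__ == "__main__":
    energies = [0.0, 1.3, 2.1]
    a = [[1, 2 - 1j, 0.5], [0.3j, -1, 1], [2, 1j, 0.7]]
    b = [[0.2, 1, -1j], [1 + 1j, 0, 2], [0.5, -0.3, 1]]
    print(kms_sides(a, b, energies, beta=0.8, t=0.6))
```

The two returned numbers agree up to rounding for any choice of the matrices `a` and `b`, which need not be self-adjoint.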


Remark 11.9.1. This identity was first noted by Kubo, Martin, and Schwinger, whose initials now commemorate their contribution. In finite dimensions we can take $\beta = 0$, and then the KMS condition simply reduces to the identity $\mathrm{tr}(A_tB) = \mathrm{tr}(BA_t)$ for traces.


Gibbs’ states are not the only ones to satisfy the KMS identity: there 
are others, which are also useful in physics. Moreover, the KMS condition 
on its own can supply a wealth of interesting information about a system, 


as we play off its reversal of the order of observables against commutation 
relations. 


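Example 11.9.1. Let us use the KMS condition to calculate the expectation value of the energy in a Gibbs state. For this we take $A = P + im\omega X$, $B = P - im\omega X$, and $t = 0$. Recalling equation (11.20) we see that

$$(P + im\omega X)_{i\beta\hbar} = e^{-\beta\hbar\omega}(P + im\omega X)_0. \qquad (11.42)$$

Dropping the subscript 0, the KMS condition therefore states that

$$E_\beta((P + im\omega X)(P - im\omega X)) = e^{-\beta\hbar\omega}E_\beta((P - im\omega X)(P + im\omega X)). \qquad (11.43)$$

Using Lemma 7.5.1 we can rewrite this identity as

$$E_\beta(2m(H - \tfrac12\hbar\omega)) = e^{-\beta\hbar\omega}E_\beta(2m(H + \tfrac12\hbar\omega)), \qquad (11.44)$$

which can be rearranged to give

$$(1 - e^{-\beta\hbar\omega})E_\beta(H) = (1 + e^{-\beta\hbar\omega})(\tfrac12\hbar\omega). \qquad (11.45)$$

The expectation value of the energy is therefore

$$E_\beta(H) = \tfrac12\hbar\omega(1 + e^{-\beta\hbar\omega})/(1 - e^{-\beta\hbar\omega}) = \tfrac12\hbar\omega\coth(\tfrac12\beta\hbar\omega). \qquad (11.46)$$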


This can be checked using the exponential probability distribution of ener- 
gies derived earlier. 
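
One way to carry out this check numerically is sketched below. It assumes the probabilities $e^{-n\beta\hbar\omega}(1 - e^{-\beta\hbar\omega})$ for the levels $(n+\tfrac12)\hbar\omega$ and truncates the sum after a fixed number of terms; the function names are hypothetical.

```python
import math


def mean_energy_from_distribution(beta, hbar_omega, terms=5000):
    """Sum of E_n p_n with E_n = (n + 1/2) hbar omega, truncated."""
    x = beta * hbar_omega
    weight = 1.0 - math.exp(-x)
    return sum((n + 0.5) * hbar_omega * math.exp(-n * x) * weight
               for n in range(terms))


def mean_energy_closed_form(beta, hbar_omega):
    """(1/2) hbar omega coth(beta hbar omega / 2)."""
    return 0.5 * hbar_omega / math.tanh(0.5 * beta * hbar_omega)


if __name__ == "__main__":
    print(mean_energy_from_distribution(0.4, 1.5))
    print(mean_energy_closed_form(0.4, 1.5))
```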

This is the example that lay at the core of the problem that originally 
led Planck to introduce the quantum hypothesis. The energy difference 
between the expectation value of H and the ground state energy is 


$$E_\beta(H) - \tfrac12\hbar\omega = \hbar\omega\,e^{-\beta\hbar\omega}/(1 - e^{-\beta\hbar\omega}). \qquad (11.47)$$

In the limit as $\hbar \to 0$ and quantum effects are suppressed, this tends to $1/\beta$, which means that the average energy of each classical oscillator is independent of its frequency. The classical problem arose because it is easier to fit short wavelength (high frequency) waves into a given size of cavity, so if the energy per wave does not depend on its frequency, then most of the energy should be carried by the higher frequency waves. (The normal frequencies of waves in a cubic box of side $a$ are given by $\omega = \pi\sqrt{j^2 + k^2 + l^2}/a$, for integer $j$, $k$, and $l$. The number of normal frequencies less than a given $\omega$ is proportional to $\omega^3$.) Planck’s law avoids this problem because the exponential damping factor, $\exp(-\beta\hbar\omega)$, in the
formula more than compensates for the polynomial growth in the number 
of short wavelength waves that can be fitted into the cavity. 


11.10*. Partition functions and the harmonic oscillator 


The formula for the expectation value of the harmonic oscillator energy 
in a KMS state can be used to provide an independent derivation of the 
energy spectrum. According to Theorem 11.9.1 a Gibbs state satisfies the 
KMS condition, and so its mean energy must be given by 


$$\frac{\mathrm{tr}(He^{-\beta H})}{\mathrm{tr}(e^{-\beta H})} = \tfrac12\hbar\omega\coth(\tfrac12\beta\hbar\omega). \qquad (11.48)$$


This has a convenient reformulation in terms of the partition function: 


Definition 11.10.1. The partition function, $Z(\beta)$, is defined by

$$Z(\beta) = \mathrm{tr}(e^{-\beta H}),$$

when the right-hand side makes sense.

Remark 11.10.1. If the energy levels $\{E_1, E_2, \ldots\}$ of the Hamiltonian have degeneracies $\{d_1, d_2, \ldots\}$, then the partition function is

$$Z(\beta) = \sum_{n=1}^{\infty} d_ne^{-\beta E_n}, \qquad (11.49)$$


provided that the sum converges. 
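Differentiating the definition and using our earlier expression for the mean energy, we have

$$\frac{Z'(\beta)}{Z(\beta)} = -\frac{\mathrm{tr}(He^{-\beta H})}{\mathrm{tr}(e^{-\beta H})} = -\tfrac12\hbar\omega\coth(\tfrac12\beta\hbar\omega). \qquad (11.50)$$

Introducing an integrating factor we obtain

$$\frac{d}{d\beta}\left[\sinh(\tfrac12\beta\hbar\omega)Z(\beta)\right] = 0, \qquad (11.51)$$

so that for some constant $C$

$$\begin{aligned} Z(\beta) &= C\,\mathrm{cosech}(\tfrac12\beta\hbar\omega) \\ &= 2Ce^{-\beta\hbar\omega/2}/(1 - e^{-\beta\hbar\omega}) \\ &= 2C\sum_{n=0}^{\infty} e^{-(n+\frac12)\beta\hbar\omega}. \end{aligned} \qquad (11.52)$$

Comparing this with the general formula in terms of energies and degeneracies shows that the energies can be expressed as $E_n = (n+\tfrac12)\hbar\omega$, and they all have the same degeneracy.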


11.11*. Algebraic quantum theory 


In Heisenberg’s picture of quantum mechanics the observables play the 
dominant role, whilst the state vectors are immutable and, except for their 
vestigial role in the calculation of expectation values, superfluous. This fact 
is exploited in a simpler though more abstract version of Heisenberg’s ap- 
proach, which uses only the algebraic properties of the observables without 
insisting that they should be linear transformations, and replaces states by 
the expectation values that they served to define. 

We have been using the following features of linear transformations on an inner product space:
(i) The linear transformations, $\mathcal{L}(\mathcal{H})$, themselves define a complex vector space, that is one can form linear combinations $\alpha A + \beta B$ of observables $A$ and $B$ with complex coefficients $\alpha$ and $\beta$.
(ii) One can form products of linear transformations and these distribute over sums so that $\mathcal{L}(\mathcal{H})$ is a ring in the sense of algebra. This ring has an identity, 1, and the product is related to the vector space structure by $(\alpha 1)A = \alpha A$.
(iii) There is an adjoint map $A \mapsto A^*$ which is conjugate linear,

$$(\alpha A + \beta B)^* = \bar\alpha A^* + \bar\beta B^*, \qquad (11.53)$$

and satisfies $(AB)^* = B^*A^*$ and $A^{**} = A$.


Definition 11.11.1. Suppose that $\mathcal{A}$ has the structure both of a vector space and, with the same addition, of a ring with identity, 1, in such a way that, for all $A \in \mathcal{A}$ and $\alpha \in \mathbf{C}$, $(\alpha 1)A = \alpha A$. Then $\mathcal{A}$ is said to be an algebra. If in addition there is a map $* : \mathcal{A} \to \mathcal{A}$ such that $(\alpha A + \beta B)^* = \bar\alpha A^* + \bar\beta B^*$, $(AB)^* = B^*A^*$, and $A^{**} = A$, then $\mathcal{A}$ is said to be a *-algebra.

The algebraic formulation of quantum theory assumes that the observables are described by the self-adjoint elements of a *-algebra, that is elements $A$ that satisfy $A^* = A$. For future reference we note that $1^* = 1^*1$ is self-adjoint and so $1^* = 1^{**} = 1$.


Definition 11.11.2. A state on a *-algebra $\mathcal{A}$ is a linear functional $E : \mathcal{A} \to \mathbf{C}$ that satisfies:

(i) $E(A^*A)$ is real and non-negative for all $A \in \mathcal{A}$,

(ii) $E(1) = 1$.

Remark 11.11.1. These assumptions are based on the properties of expectation values given in Proposition 6.3.1. The fact that $E$ is a linear functional expresses the linearity of expectations, (iv), whilst the two assumptions correspond to (iii) and (i). The reality property (ii) need not be included since it follows from the other conditions.



Proposition 11.11.1. For any state, $E$, on a *-algebra $\mathcal{A}$, and for any $A, B \in \mathcal{A}$,

$$E(A^*B) = \overline{E(B^*A)}.$$

In particular $E(A^*) = \overline{E(A)}$, so that $E(A)$ is real for self-adjoint $A$. Furthermore, there is a Cauchy–Schwarz–Bunyakowski inequality:

$$E(A^*A)E(B^*B) \ge |E(B^*A)|^2.$$


Proof. For any $A, B \in \mathcal{A}$ and $\lambda \in \mathbf{C}$ we have

$$E((A + \lambda B)^*(A + \lambda B)) = E(A^*A) + |\lambda|^2E(B^*B) + \lambda E(A^*B) + \bar\lambda E(B^*A). \qquad (11.54)$$

The left-hand side of this identity, like the first two terms on the right-hand side, is positive and so real. This forces the sum of the two remaining terms to be real and gives

$$\lambda E(A^*B) + \bar\lambda E(B^*A) = \bar\lambda\,\overline{E(A^*B)} + \lambda\,\overline{E(B^*A)}. \qquad (11.55)$$

Rearranging this we obtain

$$\lambda\left(E(A^*B) - \overline{E(B^*A)}\right) = \bar\lambda\left(\overline{E(A^*B)} - E(B^*A)\right), \qquad (11.56)$$

which, since true for all complex $\lambda$, forces both sides to vanish and gives $E(A^*B) = \overline{E(B^*A)}$. Taking $B = 1$ gives $E(A^*) = \overline{E(A)}$ and so, in particular, when $A^* = A$ we deduce that $E(A)$ is real.

Finally, taking $\lambda = tE(B^*A) = t\,\overline{E(A^*B)}$ for real $t$ we see that

$$0 \le E(A^*A) + t^2|E(A^*B)|^2E(B^*B) + 2t|E(A^*B)|^2, \qquad (11.57)$$

and the Cauchy–Schwarz–Bunyakowski inequality follows from the fact that the discriminant of this quadratic in $t$ cannot be positive. $\square$
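
Both conclusions of the proposition can be tested on a concrete state. The sketch below assumes the *-algebra of 2×2 complex matrices with the state $E(A) = \mathrm{tr}(\rho A)$ for one particular density matrix $\rho$ chosen purely for illustration; the helper names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[complex(a[c][r]).conjugate() for c in range(n)] for r in range(n)]


def state(rho):
    """Return the linear functional E(A) = tr(rho A)."""
    def expectation(a):
        prod = matmul(rho, a)
        return sum(prod[k][k] for k in range(len(a)))
    return expectation


# Illustrative density matrix: self-adjoint, trace 1, positive determinant.
RHO = [[0.7, 0.1j], [-0.1j, 0.3]]
E = state(RHO)

if __name__ == "__main__":
    a = [[1, 2j], [0, 1]]
    b = [[0.5, 1], [1j, -2]]
    print(E(matmul(dagger(a), b)), E(matmul(dagger(b), a)).conjugate())
    lhs = (E(matmul(dagger(a), a)) * E(matmul(dagger(b), b))).real
    print(lhs, abs(E(matmul(dagger(b), a))) ** 2)
```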


Curiously this abstract version of quantum theory is much closer to the 
normal formulation given in Section 6.3 than might at first appear. In 
the following result the term homomorphism means that it is both a ring 
homomorphism and a linear transformation. 


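Theorem 11.11.2. (Gel’fand, Naimark, Segal) Let $E$ be a state on a *-algebra $\mathcal{A}$. There exists an inner product space $\mathcal{H}_E$, a unit vector $\Omega_E \in \mathcal{H}_E$, and a homomorphism $\gamma : \mathcal{A} \to \mathcal{L}(\mathcal{H}_E)$, such that for all $A \in \mathcal{A}$

$$E(A) = (\Omega_E|\gamma(A)\Omega_E).$$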


Proof. The space $\mathcal{H}_E$ will be a subspace of the dual space $\mathcal{A}'$ of linear functionals on $\mathcal{A}$. We can define a homomorphism $\gamma : \mathcal{A} \to \mathcal{L}(\mathcal{A}')$ by setting, for $f \in \mathcal{A}'$ and $X \in \mathcal{A}$,

$$(\gamma(A)f)(X) = f(XA). \qquad (11.58)$$

Then

$$(\gamma(AB)f)(X) = f(XAB) = (\gamma(B)f)(XA) = (\gamma(A)\gamma(B)f)(X), \qquad (11.59)$$

so that $\gamma(AB) = \gamma(A)\gamma(B)$, whilst $\gamma(\alpha A + \beta B) = \alpha\gamma(A) + \beta\gamma(B)$ follows from the linearity of $f$.

We have yet to define the inner product, and we now restrict attention to the subspace

$$\mathcal{H}_E = \{\gamma(A)E : A \in \mathcal{A}\}, \qquad (11.60)$$

which is clearly invariant under the action of any $\gamma(B)$. On this we define

$$(\gamma(A)E|\gamma(B)E) = E(A^*B). \qquad (11.61)$$

This is well defined, since it can also be written as

$$E(A^*B) = (\gamma(B)E)(A^*), \qquad (11.62)$$

showing that it depends on $B$ only through $\gamma(B)E$. The preceding proposition tells us immediately that

$$(\gamma(A)E|\gamma(B)E) = E(A^*B) = \overline{E(B^*A)} = \overline{(\gamma(B)E|\gamma(A)E)}, \qquad (11.63)$$

showing that the inner product depends on $A$ only through $\gamma(A)E$, and also giving the conjugate symmetry. The linearity properties of the inner product follow from the linearity of $E$. It is also clear from the definition of $E$ that

$$(\gamma(A)E|\gamma(A)E) = E(A^*A) \ge 0. \qquad (11.64)$$

Finally the Cauchy–Schwarz–Bunyakowski inequality tells us that

$$|(\gamma(A)E)(B^*)|^2 = |E(B^*A)|^2 \le E(B^*B)E(A^*A), \qquad (11.65)$$

so that $E(A^*A)$ can vanish only if $(\gamma(A)E)(B^*) = 0$ for all $B$, which forces $\gamma(A)E = 0$, and shows that the inner product is strictly positive as required.

Finally, we note that the space $\mathcal{H}_E$ contains the distinguished linear functional $E$ itself, and that

$$(E|\gamma(A)E) = E(1^*A) = E(A), \qquad (11.66)$$

so that $E$ is the obvious candidate for the vector $\Omega_E$. $\square$


Remark 11.11.2. Interpreted in one way this result shows that, despite appearances, the abstract formulation of quantum theory that we have just introduced is no more general than that which we have been using since Chapter 6. We can always find an inner product space, use operators on it as observables, and calculate expectations defined by vectors in the usual way. However, this is slightly misleading for two reasons. The first is that the space $\mathcal{H}_E$ is often much bigger than one would otherwise need and not every linear transformation is of the form $\gamma(A)$. For example, when we define $E(A) = \mathrm{tr}(\rho A)$ for some density operator $\rho$ the space $\mathcal{H}_E$ is much larger than the space on which $\rho$ acts. This is inevitable, for in the large space $\mathcal{H}_E$ all the statistical uncertainty included in $\rho$ has been banished and states are represented by vectors again. We should not find this at all surprising, for if we were dealing with the statistical behaviour of molecules in a box we know that there is a larger space describing the quantum mechanics of each and every molecule in detail where statistical uncertainties disappear. However, we chose to work with density operators precisely to avoid getting entangled in that kind of detail.

The second reason for caution in interpreting the significance of this theorem is that in many important cases different states of a given physical system may lead to inequivalent spaces $\mathcal{H}_E$. It is the algebra $\mathcal{A}$ which is associated with the physical system, not the inner product space. For example, a ferromagnetic material is described by different spaces, $\mathcal{H}_E$, according to whether it is magnetized or not. The algebra, however, is the same.

Even this algebraic formulation of quantum theory is not the most gen- 
eral. Some physical observables, such as times when a photon hits a 
counter, cannot easily be interpreted as elements of the algebra, and for 
them there are still more general approaches. 


Exercises 


11.1° A particle is free to move in the $x$ direction so that its Hamiltonian $H$ is given by

$$H = P^2/2m.$$

At time $t = 0$ the system is in a state with the wave function

$$\psi(x) = Ne^{-x^2/4\sigma^2},$$

where $N$ and $\sigma$ are real constants. Find the initial expectation values $E_\psi(X^2)$, $E_\psi(P^2)$, and $E_\psi(PX + XP)$. Using the Heisenberg picture or otherwise show that at time $t$ the expectation of $X^2$ is

$$\frac{\hbar^2t^2}{4m^2\sigma^2} + \sigma^2,$$

and find the expected values of $P^2$ and $XP + PX$.
11.2° A particle moving in one dimension has the Hamiltonian

$$H = \frac{P^2}{2m} + V(X).$$

Using the Heisenberg picture or otherwise show that under suitable assumptions the expectation values of position and momentum satisfy

$$\frac{d}{dt}E(X(t)) = \frac{E(P(t))}{m}, \qquad \frac{d}{dt}E(P(t)) = -E(V'(X(t))).$$


11.3° The Hamiltonian for the one-dimensional harmonic oscillator is given by

$$H = \frac{1}{2m}\left(P^2 + m^2\omega^2X^2\right).$$

Show that 
E (XPen/aw) +E (X?) 
is constant. 
11.4° The Hamiltonian for a one-dimensional system is given by

$$H = \frac{P^2}{2m} - mkX,$$

where $k$ is a constant. Show that the expectation of the position observable at time $t$ is given by

$$E(X(t)) = \tfrac12kt^2 + E\left(\frac{P(0)t}{m} + X(0)\right).$$

Show further that

$$\lim_{t\to\infty}\frac{\Delta(X(t))}{t} = \frac{\Delta(P(0))}{m}.$$


11.5° The Hamiltonian for a spinning body is $L_3$. By considering the expectation value of $L_+L_-$ in a KMS state show that the partition function $Z(\beta)$ satisfies the differential equation

$$Z'' + \hbar\coth(\tfrac12\beta\hbar)Z' - \lambda\hbar^2Z = 0,$$

where $\lambda\hbar^2$ is the eigenvalue of $L^2$. By substituting $f = \sinh(\tfrac12\beta\hbar)Z$, or otherwise, deduce that

$$Z = A\sinh\left(\sqrt{\lambda + \tfrac14}\,\beta\hbar\right)/\sinh(\tfrac12\beta\hbar),$$

where $A$ is a constant. By considering the case of $\beta = -it$ and using periodicity in $t$, deduce that $Z$ is a multiple of

$$\frac{\sinh[(l + \tfrac12)\beta\hbar]}{\sinh(\tfrac12\beta\hbar)}$$

for some non-negative integer $l$.

11.6° Starting from the Schrödinger picture define the Heisenberg picture for a system in which the Hamiltonian, $H$, does not depend explicitly on time. Obtain the equation of motion satisfied by an operator that is not explicitly time dependent, and deduce that $H$ is constant.

A particle of mass $m$ moves along the $x$-axis under the influence of a uniform electric field with potential $Fx$. By using the Heisenberg picture, or otherwise, show that the dispersion of $P$ is independent of time and find an expression for the dispersion of $X$. Show also that

$$\frac{E(P)^2}{2m} + FE(X)$$

is constant during the motion, where $E(A)$ denotes the expectation value of the observable $A$.

11.7° The operators $J_1$, $J_2$, and $J_3$ representing the components of angular momentum are hermitian and satisfy the commutation relations

$$[J_2, J_3] = i\hbar J_1, \qquad [J_3, J_1] = i\hbar J_2, \qquad [J_1, J_2] = i\hbar J_3.$$

Let $S = J_1 - iJ_2$ and let $\Phi_m$ be an eigenvector of $J_3$ with eigenvalue $m\hbar$, where $m > 1$. Show that

$$J_3S\Phi_m = (m - 1)\hbar S\Phi_m.$$


The Hamiltonian for a quantum mechanical system is given by 


4,3 R 
where $A$ and $C$ are positive constants. Obtain the equations of motion for $J_3$ and $S$ in the Heisenberg picture. By making the substitution

S(t) = exp i G - 3) a T(t) exp f G ~_ 5) en) '

or otherwise, find $S(t)$ in the Heisenberg picture in terms of $J_3(0)$ and $S(0)$. Initially the system is in an eigenstate of $J_3$. Show that the expectation value of $J^2$ is constant.

11.8° A particle with charge $e$ forming a linear harmonic oscillator with unperturbed Hamiltonian

$$H_0 = \frac{P^2}{2m} + \tfrac12m\omega^2X^2$$

is placed in a weak electric field along the $x$-axis given by

F(t) = A? / 7
iV |

where $A$ and $\tau$ are constants. At $t = -\infty$ the oscillator is in its ground state. Find to a first approximation the probability that it will be in its first excited state at $t = \infty$.

11.9 Show that any operator of the form $\rho = \sum_j p_jQ_{\psi_j}$, with $p_j \in [0, 1]$, is positive, that is

$$(\psi|\rho\psi) \ge 0,$$

for any vector $\psi$.

11.10 For any inner product space $\mathcal{H}$, let $S$ denote the subspace of operators, $A$, for which $\mathrm{tr}(A^*A)$ is finite, with the inner product

$$(A|B) = \mathrm{tr}(A^*B).$$

Show that $\Omega = \sum_j \sqrt{p_j}\,Q_{\psi_j}$ is in $S$, and that it is related to $\rho = \sum_j p_jQ_{\psi_j}$ by

$$\mathrm{tr}(\rho A) = (\Omega|A\Omega)/\|\Omega\|^2.$$


12 Stationary perturbation theory


A beautiful Christmas and a good New Year filled with hydrogen transition probabilities, theory of helium etc.

WERNER HEISENBERG, letter to Wolfgang Pauli, 24 December 1925


12.1. Rayleigh–Schrödinger perturbation theory


For most physically interesting systems it is not possible to find simple closed formulae for the energy levels and wave functions. Generally the best that one can do is to find numerical approximations and iterative schemes. Since Schrödinger’s equation is a differential equation there are many standard numerical methods that can supply approximate solutions, but there are also various special techniques tailored to this particular situation, which we shall describe over the next chapters.

The most obvious technique is to try to compare solutions of the equation

$$H\psi = E\psi \qquad (12.1)$$

with the solutions for a more tractable Hamiltonian $H_0$. One natural way to link the behaviour of two Hamiltonians $H_0$ and $H$ is to consider the family

$$H_u = (1 - u)H_0 + uH = H_0 + uH', \qquad (12.2)$$

where $H' = H - H_0$ is the perturbation, and the real parameter $u$ just controls the strength of the perturbation. As the notation suggests, $H_u = H_0$ when $u = 0$, whilst $H_1 = H$. We would hope that each Hamiltonian has associated energy levels $E_u$ and eigenstates $\psi_u$, so that

$$H_u\psi_u = E_u\psi_u. \qquad (12.3)$$

We shall assume, for convenience, that $\psi_0$ is normalized, and that

$$(\psi_0|\psi_u) = 1, \qquad (12.4)$$

for all $u$ in the interval under consideration. (If $\psi_u$ is to be a good approximation to $\psi_0$ then we expect $(\psi_0|\psi_u) \ne 0$; indeed, since it is 1 when $u = 0$, there is an interval around 0 in which it does not vanish. Replacing $\psi_u$ by $\psi_u/(\psi_0|\psi_u)$ there, we may as well assume that $(\psi_0|\psi_u) = 1$.)
Rayleigh–Schrödinger perturbation theory proceeds by supposing that both $E_u$ and $\psi_u$ have power series expansions,

$$\begin{aligned} E_u &= E_0 + uE' + u^2E'' + \ldots, \\ \psi_u &= \psi_0 + u\psi' + u^2\psi'' + \ldots. \end{aligned} \qquad (12.5)$$

This is a very strong assumption, which can, nonetheless, sometimes be justified. When it is valid, one can simply compare coefficients of $u$ in Schrödinger’s equation,

$$(H_0 + uH')(\psi_0 + u\psi' + \ldots) = (E_0 + uE' + \ldots)(\psi_0 + u\psi' + \ldots), \qquad (12.6)$$

to obtain a sequence of equations for the various terms. The constant term gives back the unperturbed equation

$$H_0\psi_0 = E_0\psi_0, \qquad (12.7)$$

showing that $\psi_0$ and $E_0$ are an eigenstate and energy level of the unperturbed problem. The coefficient of $u$ gives

$$H_0\psi' + H'\psi_0 = E_0\psi' + E'\psi_0, \qquad (12.8)$$

for the first-order corrections $E'$, $\psi'$ to the energy and eigenstate.


Theorem 12.1.1. Suppose that $\phi_1, \ldots, \phi_D$ form an orthonormal basis for the $E_0$-eigenstates of $H_0$ with respect to which $\psi_0$ can be written as $\psi_0 = \sum c_r\phi_r$. Then the column vector of coefficients $(c_1, \ldots, c_D)$ is an eigenvector of the self-adjoint matrix with entries $(\phi_r|H'\phi_s)$ with eigenvalue $E'$, so that

$$\det((\phi_r|H'\phi_s) - E'\delta_{rs}) = 0.$$

In particular, when $E_0$ is non-degenerate, one may take $\psi_0 = \phi_1$ and $E' = (\psi_0|H'\psi_0)$. The corrections to the wave function, $\psi'$, $\psi''$, ..., can be chosen to be orthogonal to $\psi_0$, and the first-order correction is given by a solution $\psi'$ of

$$(E_0 - H_0)\psi' = (H' - E')\psi_0,$$

which is orthogonal to $\psi_0$.




Proof. Taking the inner product of equation (12.8) with $\phi_r$ we obtain

\[ \langle\phi_r|H_0\psi'\rangle + \langle\phi_r|H'\psi_0\rangle = E_0\langle\phi_r|\psi'\rangle + E'\langle\phi_r|\psi_0\rangle. \tag{12.9} \]

Exploiting the self-adjointness of $H_0$ and simplifying we see that

\[ \langle\phi_r|H_0\psi'\rangle = \langle H_0\phi_r|\psi'\rangle = E_0\langle\phi_r|\psi'\rangle, \tag{12.10} \]

so that the first terms on each side cancel leaving

\[ \langle\phi_r|H'\psi_0\rangle = E'\langle\phi_r|\psi_0\rangle. \tag{12.11} \]

Substituting the expansion of $\psi_0$ in terms of the basis $\{\phi_r\}$, and using
orthonormality, now shows that

\[ \sum_{s=1}^{D} \langle\phi_r|H'\phi_s\rangle c_s = E'c_r, \tag{12.12} \]

giving the stated eigenvector property. The determinant equation for $E'$ is
just the condition for this equation to have a non-trivial solution. The fact
that the matrix is self-adjoint follows from the identity

\[ \langle\phi_r|H'\phi_s\rangle = \langle H'\phi_r|\phi_s\rangle = \overline{\langle\phi_s|H'\phi_r\rangle}. \tag{12.13} \]

When $E_0$ is non-degenerate, so that $D = 1$, we have $\psi_0 = c_1\phi_1$, and we may
choose $c_1 = 1$. The condition on $E'$ also reduces to

\[ E' = \langle\phi_1|H'\phi_1\rangle = \langle\psi_0|H'\psi_0\rangle, \tag{12.14} \]

as asserted.
Substituting the series into equation (12.4) gives

\[ \sum_{n} u^n\langle\psi_0|\psi^{(n)}\rangle = 1, \tag{12.15} \]

and taking the coefficient of $u^n$ for $n > 0$, we see that $\psi^{(n)}$ is then
orthogonal to $\psi_0$. □


Remark 12.1.1. There are more hidden assumptions lurking in the
background of this chapter than of most others. Examples show that the
domains of the operators $H$ and $H_0$ may intersect trivially, so that $H - H_0$
is only defined on the zero vector. Even when its domain is a larger and
more interesting set there can be problems in establishing some of the later
properties. Nonetheless, perturbation theory is an invaluable theoretical
tool, and there are many genuine examples for which its use can be
justified. We shall therefore proceed formally assuming that the difference,
$H' = H - H_0$, makes good sense, and that it is, in some sense, small.

Suppose that whenever $H_0\psi$ is defined so is $H'\psi$, and for some constants
$a$ and $b$ independent of $\psi$ one has the inequality

\[ \|H'\psi\| \le a\|H_0\psi\| + b\|\psi\|; \tag{12.16} \]

then there is a positive $R$ such that $E_u$ and $\psi_u$ are analytic functions of $u$
within the region where $|u| < R$. The helium atom (Section 12.3) falls into
the category of examples covered by this criterion, so that in that case the
Rayleigh–Schrödinger method can be justified.


12.2. Examples 


To demonstrate the method we shall first consider an example that can be
solved exactly, so that we may check our results. We take the Hamiltonian
$H$ for a two-dimensional oscillator in a potential
$V = \frac12 m\omega^2(x^2 + y^2) + uxy$.
Diagonalizing the potential energy, as in Section 3.2, gives

\[ \det\begin{pmatrix} m\omega^2 - \lambda & u \\ u & m\omega^2 - \lambda \end{pmatrix} = 0, \tag{12.17} \]

so that $\lambda = m\omega^2 \pm u$, and the frequencies, $\sqrt{\lambda/m}$, are given by

\[ \omega_\pm = \omega\left(1 \pm \frac{u}{m\omega^2}\right)^{1/2} = \omega\left(1 \pm \frac{u}{2m\omega^2} + \cdots\right). \tag{12.18} \]

The true energies are therefore of the form

\[ (n_+ + \tfrac12)\hbar\omega_+ + (n_- + \tfrac12)\hbar\omega_- = (n_+ + n_- + 1)\hbar\omega + (n_+ - n_-)\frac{u\hbar}{2m\omega} + \cdots. \tag{12.19} \]

We shall now compare the two lowest energy states with those of the
isotropic oscillator whose potential is $\frac12 m\omega^2(x^2 + y^2)$, taking $uxy$ as the
perturbation. The unperturbed Hamiltonian $H_0$ has energy levels
$(n_1 + \frac12)\hbar\omega + (n_2 + \frac12)\hbar\omega = (n_1 + n_2 + 1)\hbar\omega$, with eigenfunctions
$\psi_{n_1 n_2} = \phi_{n_1}(x)\phi_{n_2}(y)$, where $\phi_n$ is the wave function for a
one-dimensional oscillator. For the non-degenerate ground state $\psi_{00}$, we have
immediately

\[ E' = \langle\psi_{00}|H'\psi_{00}\rangle = \int_{\mathbb{R}^2} \overline{\phi_0(x)\phi_0(y)}\,xy\,\phi_0(x)\phi_0(y)\,dx\,dy = \int_{\mathbb{R}} x|\phi_0(x)|^2\,dx \int_{\mathbb{R}} y|\phi_0(y)|^2\,dy. \tag{12.20} \]

Now $x|\phi_0(x)|^2$ is an odd function so its integral must vanish, leaving $E' = 0$.
This agrees with the exact solution above, which has no first-order term in
$u$ when $n_+ = 0 = n_-$.

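The first excited state is doubly degenerate with the orthonormal basis
$\psi_{10}$ and $\psi_{01}$, so that we must consider the matrix with entries
$\langle\psi_{kl}|H'\psi_{rs}\rangle$ where $k + l = 1 = r + s$. Arguing as for the ground state,
each diagonal element is of the form

\[ \langle\psi_{rs}|H'\psi_{rs}\rangle = \int x|\phi_r(x)|^2\,dx \int y|\phi_s(y)|^2\,dy, \tag{12.21} \]

and since the integrands are again odd, this vanishes. By Theorem 12.1.1,
we know that the matrix whose eigenvalues and eigenvectors are sought is
self-adjoint and so has the form

\[ \begin{pmatrix} 0 & \alpha \\ \overline{\alpha} & 0 \end{pmatrix}. \tag{12.22} \]

The eigenvalues are easily found to be $E' = \pm|\alpha|$, and eigenvectors $(1, \pm 1)$,
so that, on normalizing, $\psi_0 = (\psi_{01} \pm \psi_{10})/\sqrt{2}$. To find the energy
correction we note that

\[ \alpha = \langle\psi_{01}|H'\psi_{10}\rangle = \int_{\mathbb{R}} x\,\overline{\phi_0(x)}\phi_1(x)\,dx \int_{\mathbb{R}} y\,\overline{\phi_1(y)}\phi_0(y)\,dy = \left|\int_{\mathbb{R}} x\,\overline{\phi_0(x)}\phi_1(x)\,dx\right|^2. \tag{12.23} \]

It follows from Theorem 7.7.3 and equation (7.50) that the first excited
state is related to the ground state by $\phi_1(x) = (2m\omega/\hbar)^{1/2}\,x\phi_0(x)$, so that

\[ \int x\,\overline{\phi_0(x)}\phi_1(x)\,dx = \left(\frac{\hbar}{2m\omega}\right)^{1/2} \int |\phi_1(x)|^2\,dx = \left(\frac{\hbar}{2m\omega}\right)^{1/2}, \tag{12.24} \]

and $E' = \pm|\alpha| = \pm\hbar/2m\omega$. This agrees with our precise formula when
$n_+ = 1$ and $n_- = 0$ or vice versa.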
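The comparison made in this example can also be checked numerically. The following is only a sketch, written in Python with units in which $\hbar = m = \omega = 1$ (an assumption made for convenience); the function names are illustrative and not part of the text. It computes the first-order splitting $\pm|\alpha|$ from the matrix element in (12.24) and compares it with the slope at $u = 0$ of the exact energies (12.19).

```python
import math

# Units with hbar = m = omega = 1 (an assumption made only for this sketch).
HBAR = M = OMEGA = 1.0


def exact_energy(n_plus, n_minus, u):
    """Exact level (n_+ + 1/2) hbar w_+ + (n_- + 1/2) hbar w_-, as in (12.18)-(12.19)."""
    w_plus = OMEGA * math.sqrt(1 + u / (M * OMEGA**2))
    w_minus = OMEGA * math.sqrt(1 - u / (M * OMEGA**2))
    return (n_plus + 0.5) * HBAR * w_plus + (n_minus + 0.5) * HBAR * w_minus


def first_order_splitting():
    """Eigenvalues -|alpha|, +|alpha| of the matrix (12.22)."""
    x01 = math.sqrt(HBAR / (2 * M * OMEGA))  # <phi_0| x phi_1>, from (12.24)
    alpha = x01 * x01                        # alpha = |<phi_0| x phi_1>|^2, from (12.23)
    return -abs(alpha), abs(alpha)


def slope(n_plus, n_minus, h=1e-6):
    """Central-difference estimate of dE/du at u = 0 from the exact formula."""
    return (exact_energy(n_plus, n_minus, h) - exact_energy(n_plus, n_minus, -h)) / (2 * h)


if __name__ == "__main__":
    print("perturbative E':", first_order_splitting())
    print("exact slopes:   ", slope(0, 1), slope(1, 0))
    print("ground state:   ", slope(0, 0))
```

The two exact slopes for the first excited level should reproduce the pair $\pm\hbar/2m\omega$, and the ground-state slope should vanish, in line with $E' = 0$ found above.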


12.3. The ground state of the helium atom 


We shall now apply perturbation theory to a more interesting example.
The helium atom has two electrons orbiting a nucleus of charge 2. By
changing coordinates we can separate out the centre of mass and obtain
the Hamiltonian

\[ H = -\frac{\hbar^2}{2m}\nabla_1^2 - \frac{\hbar^2}{2m}\nabla_2^2 - \frac{2e^2}{4\pi\epsilon_0 r_1} - \frac{2e^2}{4\pi\epsilon_0 r_2} + \frac{e^2}{4\pi\epsilon_0|\mathbf{r}_1 - \mathbf{r}_2|}, \tag{12.25} \]

where $\mathbf{r}_1$ and $\mathbf{r}_2$ are position vectors for the two electrons. Apart from
the final term, which describes the repulsion between the two electrons,
this looks just like the sum of two hydrogen-like Hamiltonians with nuclear
charge $Z = 2$, so we take

\[ H_0 = \left(-\frac{\hbar^2}{2m}\nabla_1^2 - \frac{Ze^2}{4\pi\epsilon_0 r_1}\right) + \left(-\frac{\hbar^2}{2m}\nabla_2^2 - \frac{Ze^2}{4\pi\epsilon_0 r_2}\right). \tag{12.26} \]

It will be useful to work with a general charge $Z$, as we shall need the
results of the calculation again in Chapter 14.

The lowest energy state of $H_0$ is non-degenerate and according to 4.3.1
is described by the product of two hydrogenic ground states:

\[ \psi_0(\mathbf{r}_1, \mathbf{r}_2) = \left(\frac{Z^3}{\pi a^3}\right)^{1/2} e^{-Zr_1/a} \left(\frac{Z^3}{\pi a^3}\right)^{1/2} e^{-Zr_2/a} = \frac{Z^3}{\pi a^3}\, e^{-Z(r_1 + r_2)/a}. \tag{12.27} \]

The obvious approximation to the ground state of helium therefore starts
with $\psi_0$ and energy $E_0 = -Z^2e^2/4\pi\epsilon_0 a$, which is the sum of two
hydrogen-like ground state energies.

The perturbation is

\[ H' = \frac{e^2}{4\pi\epsilon_0|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{12.28} \]

To find the first-order correction to the energy we need to evaluate

\[ \langle\psi_0|H'\psi_0\rangle = \left(\frac{Z^3}{\pi a^3}\right)^2 \int \frac{e^2}{4\pi\epsilon_0|\mathbf{r}_1 - \mathbf{r}_2|}\, e^{-2Z(r_1 + r_2)/a}\, d^3r_1\, d^3r_2. \tag{12.29} \]

The only angular dependence occurs in the term

\[ |\mathbf{r}_1 - \mathbf{r}_2|^{-1} = (r_1^2 + r_2^2 - 2r_1r_2\cos\theta)^{-1/2}, \tag{12.30} \]

where $\theta$ is the angle between $\mathbf{r}_1$ and $\mathbf{r}_2$. Considering just the angular part
obtained by integrating this term over $\mathbf{r}_2$ we have

\[ \int_0^\pi\!\!\int_0^{2\pi} (r_1^2 + r_2^2 - 2r_1r_2\cos\theta)^{-1/2} \sin\theta\, d\phi\, d\theta
 = \frac{2\pi}{r_1r_2}\left[(r_1^2 + r_2^2 - 2r_1r_2\cos\theta)^{1/2}\right]_0^\pi
 = \frac{2\pi}{r_1r_2}\left[(r_1 + r_2) - |r_1 - r_2|\right]
 = \begin{cases} 4\pi/r_1 & \text{if } r_1 > r_2 \\ 4\pi/r_2 & \text{if } r_2 > r_1, \end{cases} \tag{12.31} \]

a result well known in potential theory.

Since this no longer depends on the angles, the integration over the
possible directions of $\mathbf{r}_1$ just multiplies by $4\pi$. The integral for
$\langle\psi_0|H'\psi_0\rangle$ therefore reduces to the sum of

\[ \left(\frac{Z^3}{\pi a^3}\right)^2 \frac{e^2}{4\pi\epsilon_0} (4\pi)^2 \int_0^\infty\!\!\int_{r_2}^\infty \frac{1}{r_1}\, e^{-2Z(r_1 + r_2)/a}\, r_1^2\, dr_1\, r_2^2\, dr_2 \tag{12.32} \]

and a similar term with $r_1$ and $r_2$ interchanged, giving all together

\[ 32\left(\frac{Z}{a}\right)^6 \frac{e^2}{4\pi\epsilon_0} \int_0^\infty \left(\int_{r_2}^\infty e^{-2Zr_1/a}\, r_1\, dr_1\right) e^{-2Zr_2/a}\, r_2^2\, dr_2. \tag{12.33} \]

A simple integration by parts argument (or repeated differentiation with
respect to $k$ of the $n = 0$ case) shows that

\[ \int_R^\infty r^n e^{-kr}\, dr = n!\,k^{-(n+1)} \left[e^{-kR} \sum_{s=0}^{n} \frac{1}{s!}(kR)^s\right], \tag{12.34} \]

so that, on doing the first integration, the previous integral reduces to

\[ 32\left(\frac{Z}{a}\right)^6 \frac{e^2}{4\pi\epsilon_0} \left(\frac{a}{2Z}\right)^2 \int_0^\infty \left(1 + \frac{2Zr_2}{a}\right) e^{-4Zr_2/a}\, r_2^2\, dr_2. \tag{12.35} \]

Using (12.34) with $R = 0$ to do the final integration, we arrive at

\[ 32\left(\frac{Z}{a}\right)^6 \frac{e^2}{4\pi\epsilon_0} \left(\frac{a}{2Z}\right)^2 \left[2!\left(\frac{a}{4Z}\right)^3 + 3!\,\frac{2Z}{a}\left(\frac{a}{4Z}\right)^4\right] = \frac{5}{8}\,\frac{Ze^2}{4\pi\epsilon_0 a}. \tag{12.36} \]

Recalling that in this case $Z = 2$, we see that the first-order estimate of
the energy is therefore

\[ -\frac{e^2}{\pi\epsilon_0 a}\left(1 - \frac{5}{16}\right) = -0.6875\,\frac{e^2}{\pi\epsilon_0 a}. \tag{12.37} \]

Compared with the experimental value of $-0.73\,e^2/\pi\epsilon_0 a$ this is about 5%
too high.

12.4. Higher order Rayleigh–Schrödinger theory


For simplicity we shall consider higher order corrections only when the
energy level $E_0$ is non-degenerate, so that the $E_0$-eigenvector, $\psi_0$, is
uniquely determined up to multiples.

The coefficient of $u^k$ in the expansion of the Schrödinger equation gives

\[ H_0\psi^{(k)} + H'\psi^{(k-1)} = E_0\psi^{(k)} + E'\psi^{(k-1)} + \cdots + E^{(k)}\psi_0, \tag{12.38} \]

and, taking the inner product with $\psi_0$ and using the orthogonality of $\psi_0$
and $\psi^{(j)}$ for $j > 0$, we obtain

\[ \langle\psi_0|H_0\psi^{(k)}\rangle + \langle\psi_0|H'\psi^{(k-1)}\rangle = E_0\langle\psi_0|\psi^{(k)}\rangle + E^{(k)}\langle\psi_0|\psi_0\rangle. \tag{12.39} \]

As in the first-order case the first terms on each side cancel (both vanish),
leaving

\[ \langle\psi_0|H'\psi^{(k-1)}\rangle = E^{(k)}\|\psi_0\|^2. \tag{12.40} \]


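Actually this is only one of many formulae for $E^{(k)}$ that can be obtained
from the following useful result.


Lemma 12.4.1. For any $u \neq v$ we have

\[ \langle\psi_u|H'\psi_v\rangle = \left(\frac{E_u - E_v}{u - v}\right) \langle\psi_u|\psi_v\rangle. \]


Proof. We first note that

\[ \langle\psi_u|(H_u - H_v)\psi_v\rangle = \langle H_u\psi_u|\psi_v\rangle - \langle\psi_u|H_v\psi_v\rangle = (E_u - E_v)\langle\psi_u|\psi_v\rangle, \tag{12.41} \]

from which the result follows on dividing by $u - v$ and noting that
$H_u - H_v = (u - v)H'$. □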
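Lemma 12.4.1 is an exact identity rather than an expansion, so it can be tested directly on a small model. The sketch below uses a hypothetical two-level family $H_u = H_0 + uH'$ with $H_0 = \mathrm{diag}(0, 1)$ and $H'$ the off-diagonal matrix with unit entries; this model and the function names are assumptions chosen for illustration, not an example from the text. The eigenvector is scaled so that $\langle\psi_0|\psi_u\rangle = 1$, as in (12.4).

```python
import math


def eigenpair(u):
    """Lowest eigenvalue E_u and eigenvector psi_u = (1, c_u) of [[0, u], [u, 1]].

    The first component is fixed at 1, so that <psi_0|psi_u> = 1 with psi_0 = (1, 0).
    """
    energy = (1 - math.sqrt(1 + 4 * u * u)) / 2
    # The first row of H_u psi = E_u psi reads u * c = E_u.
    c = energy / u if u != 0 else 0.0
    return energy, (1.0, c)


def lemma_sides(u, v):
    """Return both sides of Lemma 12.4.1 for H' = [[0, 1], [1, 0]]."""
    e_u, psi_u = eigenpair(u)
    e_v, psi_v = eigenpair(v)
    lhs = psi_u[0] * psi_v[1] + psi_u[1] * psi_v[0]      # <psi_u|H' psi_v>
    overlap = psi_u[0] * psi_v[0] + psi_u[1] * psi_v[1]  # <psi_u|psi_v>
    rhs = (e_u - e_v) / (u - v) * overlap
    return lhs, rhs


if __name__ == "__main__":
    print(lemma_sides(0.1, 0.3))
    print(lemma_sides(0.25, 0.0))
```

The two numbers in each pair should agree to rounding error for any $u \neq v$, not merely for small parameters, since no series expansion was used in the proof.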


By comparing coefficients of $v^{k-1}$ we obtain our previous formula for
$E^{(k)}$, but there are many other possibilities.


Corollary 12.4.2. The energy correction $E^{(2k+1)}$ depends only on
the lower order energy corrections and the first $k$ corrections to the
wave function.

Proof. We first note that

\[ \frac{E_u - E_v}{u - v} = \sum_{n=1}^{\infty} E^{(n)}\,\frac{u^n - v^n}{u - v} = \sum_{n=1}^{\infty} E^{(n)} \left(\sum_{r+s=n-1} u^r v^s\right). \tag{12.42} \]

Comparing coefficients of $u^kv^k$ in the lemma, we see that there is a term
$E^{(2k+1)}\|\psi_0\|^2$, and all the other terms involve lower order energy
corrections and the wave functions up to $\psi^{(k)}$. □


The $uv$ coefficient, which after some simplification yields

\[ \langle\psi'|H'\psi'\rangle = E'''\|\psi_0\|^2 + E'\|\psi'\|^2, \tag{12.43} \]

shows, in particular, that $E'''$ depends only on $\psi'$, and not on $\psi''$, as the
earlier formula would have suggested.

Nonetheless, even with this improvement, we need $\psi'$ in order to get
beyond the first-order energy correction.


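Theorem 12.4.3. Suppose that $\{\psi_0, \psi_1, \psi_2, \ldots\}$ is an orthonormal basis
of eigenvectors for $H_0$, with corresponding eigenvalues $E_\alpha \neq E_0$, for
$\alpha \neq 0$. Then we have

\[ \psi' = \sum_{\alpha\neq 0} \frac{\langle\psi_\alpha|H'\psi_0\rangle}{E_0 - E_\alpha}\,\psi_\alpha, \]

\[ E'' = \sum_{\alpha\neq 0} \frac{\langle\psi_0|H'\psi_\alpha\rangle\langle\psi_\alpha|H'\psi_0\rangle}{E_0 - E_\alpha}. \]


Proof. The vector $\psi'$ is orthogonal to $\psi_0$ and so its expansion with
respect to the orthonormal basis takes the form

\[ \psi' = \sum_{\alpha\neq 0} \langle\psi_\alpha|\psi'\rangle\,\psi_\alpha. \tag{12.44} \]

Now, the inner product of the first-order equation with $\psi_\alpha$ gives

\[ \langle\psi_\alpha|H_0\psi'\rangle - E_0\langle\psi_\alpha|\psi'\rangle = E'\langle\psi_\alpha|\psi_0\rangle - \langle\psi_\alpha|H'\psi_0\rangle. \tag{12.45} \]

The first term on the right vanishes, because $\psi_\alpha$ and $\psi_0$ correspond to
distinct eigenvalues and so are orthogonal. By the usual argument we also
have

\[ \langle\psi_\alpha|H_0\psi'\rangle = \langle H_0\psi_\alpha|\psi'\rangle = E_\alpha\langle\psi_\alpha|\psi'\rangle, \tag{12.46} \]

so that

\[ (E_0 - E_\alpha)\langle\psi_\alpha|\psi'\rangle = \langle\psi_\alpha|H'\psi_0\rangle. \tag{12.47} \]

Substituting this into the earlier expansion we obtain

\[ \psi' = \sum_{\alpha\neq 0} \frac{\langle\psi_\alpha|H'\psi_0\rangle}{E_0 - E_\alpha}\,\psi_\alpha \tag{12.48} \]

as asserted, and substituting this into $E'' = \langle\psi_0|H'\psi'\rangle$ gives the formula
for the second-order energy correction. □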
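The sums in Theorem 12.4.3 are finite when $H_0$ acts on a finite-dimensional space, which makes them easy to try out. The following sketch assumes that $H_0$ is already diagonal, with its eigenvalues in a list and $H'$ given as a matrix in the same basis; the two-level numbers are a hypothetical example chosen because the exact eigenvalue is known in closed form.

```python
import math


def first_order_state(energies, h_prime, state=0):
    """Coefficients <psi_alpha|psi'> = <psi_alpha|H' psi_0> / (E_0 - E_alpha), as in (12.48)."""
    e0 = energies[state]
    return [0.0 if a == state else h_prime[a][state] / (e0 - e)
            for a, e in enumerate(energies)]


def second_order_energy(energies, h_prime, state=0):
    """E'' = sum over alpha != state of |<psi_alpha|H' psi_0>|^2 / (E_0 - E_alpha)."""
    e0 = energies[state]
    return sum(abs(h_prime[a][state]) ** 2 / (e0 - e)
               for a, e in enumerate(energies) if a != state)


# Hypothetical two-level model: H_0 = diag(0, 1), H' = [[0, 1], [1, 0]].
ENERGIES = [0.0, 1.0]
H_PRIME = [[0.0, 1.0], [1.0, 0.0]]


def exact_lowest(u):
    """Lowest eigenvalue of H_0 + u H' = [[0, u], [u, 1]]."""
    return (1 - math.sqrt(1 + 4 * u * u)) / 2


def series_estimate(u):
    """E_0 + u E' + u^2 E'' with E' = <psi_0|H' psi_0>."""
    e_first = H_PRIME[0][0]
    return ENERGIES[0] + u * e_first + u * u * second_order_energy(ENERGIES, H_PRIME)


if __name__ == "__main__":
    u = 0.01
    print(first_order_state(ENERGIES, H_PRIME))
    print(exact_lowest(u), series_estimate(u))
```

In this model $E' = 0$ and $E'' = -1$, and the difference between the exact eigenvalue and the second-order estimate should be of order $u^4$, since the odd-order corrections vanish here.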


12.5. The Berry phase 


Rayleigh–Schrödinger perturbation theory is concerned with the behaviour
of a parametrized family of Hamiltonians, $\{H_u\}$, for small changes in the
parameter, $u$. Recently it has been realized that subtle changes can occur
even when the Hamiltonian returns to its original form after a series of
changes.

We need not restrict ourselves to a real parameter $u$ as we did in
perturbation theory, but rather take $u$ to lie in a subset $U$ of $\mathbb{R}^n$ for some
$n \ge 1$. It will be useful to introduce a different normalization rule to that
of perturbation theory, one more appropriate to large parameter changes.


Definition 12.5.1. Vectors will be normalized so that $\|\psi_0\| = 1$, and

\[ \left\langle\psi_u\,\Big|\,\frac{\partial\psi_u}{\partial u_j}\right\rangle = 0 \]

for $j = 1, 2, \ldots, n$. We shall often abbreviate this to

\[ \langle\psi_u|\mathrm{grad}(\psi_u)\rangle = 0. \]


This normalization rule arises naturally for a system whose Hamiltonian
is changing, but slowly enough that the state can adjust and always be in
an eigenstate. (This is the adiabatic approximation.) This rule specifies
that infinitesimal changes to $\psi_u$ are orthogonal to $\psi_u$, whereas the previous
rule ensured that all changes were orthogonal to $\psi_0$. We also have

\[ \frac{\partial}{\partial u_j}\langle\psi_u|\psi_u\rangle = \left\langle\psi_u\,\Big|\,\frac{\partial\psi_u}{\partial u_j}\right\rangle + \left\langle\frac{\partial\psi_u}{\partial u_j}\,\Big|\,\psi_u\right\rangle = 0, \tag{12.49} \]

so that $\|\psi_u\|^2$ is a constant, and $\psi_u$ can remain normalized for all $u$.
However, the rule does more than just ensure normalization: it also fixes the
phase of $\psi_u$, as the following result shows.


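Proposition 12.5.1. Suppose that, for each $u \in U$, $E_u$ is non-degenerate,
and let $\phi_u$ be a normalized $E_u$-eigenvector of $H_u$. Then
$\phi_u$ can be expressed as $\lambda_u^{-1}\psi_u$ where

\[ \lambda_u = \exp\left(-\int_C \langle\phi_u|d\phi_u\rangle\right), \]

$C$ is any curve in $U$ that joins 0 to $u$, and

\[ \langle\phi_u|d\phi_u\rangle = \sum_{j=1}^{n} \left\langle\phi_u\,\Big|\,\frac{\partial\phi_u}{\partial u_j}\right\rangle du_j. \]


Proof. Since $E_u$ is non-degenerate, $\phi_u$ must take the form $\lambda^{-1}\psi_u$, for
some $\lambda \in \mathbb{C}\setminus\{0\}$. By the normalization rule

\[ 0 = \left\langle\psi_u\,\Big|\,\frac{\partial\psi_u}{\partial u_j}\right\rangle
 = \left\langle\lambda\phi_u\,\Big|\,\frac{\partial\lambda}{\partial u_j}\phi_u + \lambda\frac{\partial\phi_u}{\partial u_j}\right\rangle
 = \overline{\lambda}\left(\frac{\partial\lambda}{\partial u_j}\|\phi_u\|^2 + \lambda\left\langle\phi_u\,\Big|\,\frac{\partial\phi_u}{\partial u_j}\right\rangle\right). \tag{12.50} \]

Since $\lambda \neq 0$ and $\|\phi_u\| = 1$ we have

\[ \lambda^{-1}\frac{\partial\lambda}{\partial u_j} = -\left\langle\phi_u\,\Big|\,\frac{\partial\phi_u}{\partial u_j}\right\rangle. \tag{12.51} \]

Integrating

\[ \log\lambda = -\int_C \langle\phi_u|d\phi_u\rangle, \tag{12.52} \]

whence the result follows by exponentiation. □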


The case when we return to 0 along a closed curve $C$ is especially
interesting.


Corollary 12.5.2. If $u$ traverses a closed curve $C$ then the wave
function is multiplied by the phase factor

\[ \exp\left(-\int_C \langle\phi_u|d\phi_u\rangle\right). \]


Definition 12.5.2. The factor $\exp(-\int_C \langle\phi_u|d\phi_u\rangle)$ is called the Berry
phase factor associated to the curve $C$.


If the closed curve $C$ can be spanned by a surface $S \subset U$, then Stokes'
theorem allows us to express the factor as

\[ \exp\left(-\int_S \sum_{j,k} \left\langle\frac{\partial\phi_u}{\partial u_j}\,\Big|\,\frac{\partial\phi_u}{\partial u_k}\right\rangle du_j \wedge du_k\right). \tag{12.53} \]

The general existence of this large scale phase factor for adiabatic systems
was noted by Michael Berry in 1984. A phase factor of this kind occurs for
many systems, and in particular for polarized photons in an optical fibre,
where it was demonstrated experimentally a few years later by Tomita and
Chiao.
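To see a Berry phase factor emerge from the line integral in Definition 12.5.2, one can discretize a closed curve and sum the increments $\langle\phi_u|d\phi_u\rangle$. The sketch below uses a hypothetical two-component family $\phi = (\cos(\theta/2), e^{i\varphi}\sin(\theta/2))$ with $\theta$ fixed and $\varphi$ running from 0 to $2\pi$; this family is not taken from the text and is chosen only because the integral can be done by hand, giving $\oint\langle\phi|d\phi\rangle = i\pi(1 - \cos\theta)$.

```python
import cmath
import math


def state(theta, phi):
    """Normalized two-component vector (cos(theta/2), exp(i phi) sin(theta/2))."""
    return (complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2))


def inner(a, b):
    """Inner product <a|b>, conjugate-linear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def berry_factor(theta, steps=20000):
    """Approximate exp(-oint <phi|d phi>) by a Riemann sum over 0 <= phi <= 2 pi."""
    integral = 0j
    for k in range(steps):
        here = state(theta, 2 * math.pi * k / steps)
        there = state(theta, 2 * math.pi * (k + 1) / steps)
        increment = tuple(t - h for h, t in zip(here, there))  # d phi
        integral += inner(here, increment)
    return cmath.exp(-integral)


def expected_factor(theta):
    """Closed form exp(-i pi (1 - cos theta)) for this particular family."""
    return cmath.exp(-1j * math.pi * (1 - math.cos(theta)))


if __name__ == "__main__":
    for theta in (0.0, math.pi / 3, math.pi / 2):
        print(theta, berry_factor(theta), expected_factor(theta))
```

The forward-difference sum carries a discretization error that shrinks like the reciprocal of `steps`, so agreement with the closed form is only approximate at any fixed resolution.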


12.6. The Bloch—Floquet theorems 


Consider a particle moving on the x-axis under the influence of a potential,
$V$, that is periodic of period $a$. (This might provide a simple model of
a crystal in which the atoms are regularly spaced.) Let us introduce the
family of Hamiltonians

\[ H_u = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x - ua), \tag{12.54} \]

for $u \in [0, 1]$.

Since $V$ is periodic $H_1 = H_0$, and we have a closed curve in the space
of Hamiltonians. If $\phi$ is an eigenfunction of $H_0$ with energy $E$ then clearly
$\phi_u(x) = \phi(x - ua)$ will be an eigenfunction of $H_u$ with the same energy,
since we have only changed variable.

By the chain rule

\[ \frac{\partial\phi_u}{\partial u}(x) = -a\phi'(x - ua) = -\frac{ia}{\hbar}P\phi(x - au) = -\frac{ia}{\hbar}P\phi_u, \tag{12.55} \]

so

\[ \langle\phi_u|d\phi_u\rangle = -\frac{ia}{\hbar}\langle\phi_u|P\phi_u\rangle\,du = -\frac{ia}{\hbar}E_{\phi_u}(P)\,du = -\frac{ia}{\hbar}E_\phi(P)\,du. \tag{12.56} \]

By changing variable the inner products with $\phi$ and $\phi_u$ are the same, so
the expectation values for the two states are the same. On integrating from
$u = 0$ to $u = 1$ we obtain the following result:


Theorem 12.6.1. (Floquet's theorem) The Berry phase factor
for a particle moving in one dimension in the presence of a potential
that is periodic of period $a$ is

\[ \exp\left(-\int \langle\phi_u|d\phi_u\rangle\right) = \exp\left(\frac{ia}{\hbar}E_\phi(P)\right). \]


This property of ordinary differential equations with periodic coefficients
was discovered by Floquet, and the factor is known as the monodromy. One
important consequence of the formula for the monodromy is that it is the
same if $E_\phi(P)$ is increased by an integer multiple of $2\pi\hbar/a$. This gives rise
to a periodicity in momentum space.

Bloch discovered the corresponding result in quantum theory, and
generalized it to three dimensions, where one has a crystal lattice spanned by
three vectors $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{a}_3$. We shall use square brackets to denote triple
scalar products, so that $[\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3] = \mathbf{a}_1\cdot(\mathbf{a}_2 \times \mathbf{a}_3)$.


Theorem 12.6.2. (Bloch's theorem) Suppose that a particle
moves in $\mathbb{R}^3$ in the presence of a potential, $V$, that satisfies

\[ V(\mathbf{x} + \mathbf{a}_j) = V(\mathbf{x}) \]

for $j = 1, 2, 3$. Translation through $\mathbf{n} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3$ multiplies
the wave function by a factor

\[ \exp\left(\frac{i}{\hbar}E_\phi(\mathbf{n}\cdot\mathbf{P})\right). \]


This time the factors are the same (for all $\mathbf{n}$) if the expectation of $\mathbf{P}$ is
increased by a vector in the reciprocal lattice spanned by
$\mathbf{a}_2 \times \mathbf{a}_3/[\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3]$, $\mathbf{a}_3 \times \mathbf{a}_1/[\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3]$, and
$\mathbf{a}_1 \times \mathbf{a}_2/[\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3]$.

The Bloch theorem is fundamental to much solid state physics, since 
many materials are crystalline, and so have periodic potentials. The peri- 
odicity in momentum space when the momentum is increased by a vector 
in the reciprocal lattice gives rise to a band structure in the permissible 
energies. When the bands are separated by wide gaps, considerable energy 
has to be expended to raise an electron from one band to another. If a 
band is full, this can make it difficult to accelerate its electrons to give a 
current, and the material behaves as an insulator. If a band is only partly 
full the electrons can easily be raised to higher energy states within the 
band, and a current is easily produced, so that one has a good conductor. 
Between these two extremes are some semi-conductors, which have bands 
which are very narrow or overlap. 


12.7. Historical notes 


Schrödinger was able to use perturbation theory, an idea borrowed from
classical mechanics, to explain some of the features of Johannes Stark's
observations of the spectrum of atoms in an electric field. Schrödinger
added the potential $H' = \mathbf{F}\cdot\mathbf{r}$ for a uniform electric field $\mathbf{F}$ to the hydrogen
atom Hamiltonian $H_0$, and was able to show that the degenerate excited
energy levels split. The change in the energy levels reflects the distortion
of the atom caused by the field. It is this distortion which is responsible
for the fact that the dielectric constant, $\epsilon$, within matter differs from its
value in empty space.

Mathematically, this example has to be treated with great caution.
Actually, E.C. Titchmarsh showed that the Hamiltonian $H$ has no eigenvectors,
so that we can hardly expect to find them by perturbation theory. In
fact, the series $E_0 + uE' + \cdots$ can be proved to be divergent. One should
really expect this physically. Quantum mechanically the electron can tunnel
through the potential barrier, and then the electric field will accelerate
it away from the nucleus. This is just saying that the field can ionize the
atom by snatching its electron. Truly bound states are thus impossible and
so there are no energies to compute by perturbation theory or any other
means. However, a detailed calculation shows that for realistic fields $\mathbf{F}$ this
dissociation process takes a very long time. On ordinary timescales the
wave functions are very nearly time independent, and behave as though
they described bound states. It is these 'metastable states' whose approximate
energy the perturbation theory evaluates, and the coefficients of the
perturbation series encode important physical information.


Exercises 


12.1 Prove formula (12.43) for $E'''$.


12.2° A quantum mechanical system has Hamiltonian

\[ H = \frac{P^2}{2m} + \frac12 m\omega^2X^2 + \lambda X^4, \]

with $\lambda$ a small real parameter. Show that to first order in $\lambda$ the
energy eigenvalues calculated using perturbation theory are

\[ (n + \tfrac12)\hbar\omega + (2n^2 + 2n + 1)\frac{3\lambda\hbar^2}{4m^2\omega^2}. \]


[Hint: You may find it useful to use the creation and annihilation 
operators and the relation [a_, at] = na%~, for n =1,...,00.] 


12.3° A quantum mechanical system has Hamiltonian

\[ H_0 = \frac{P^2}{2m} + \frac12 m\omega^2X^2. \]

Calculate the second-order perturbation theoretic corrections to the
energy levels for each of the following perturbations:
(i) $H' = \epsilon X$;
(ii) $H' = \epsilon P$;
(iii) $H' = \epsilon(PX + XP)$.
Where possible compare your results with the exact answers.


12.4° Find the eigenstates and energies for a particle of mass $m$ which is
confined to a two-dimensional square box $0 < x < a$, $0 < y < a$.
Comment on the degeneracy of the ground state and first excited
state. If the system is now subjected to a small perturbation with
potential energy $V = \epsilon xy$, find the energy change of the ground
state and the first excited state to first order in $\epsilon$. Construct the
corresponding zero-order wave functions for the perturbed system
for the first excited state.


12.5° In a model of the hydrogen atom that allows for the finite size of the
nucleus, the electron moves in the potential

\[ V(r) = \begin{cases} -Ze^2/4\pi\epsilon_0 r & r > R \\ -Ze^2/4\pi\epsilon_0 R & 0 < r < R. \end{cases} \]

If $H_0$ is chosen to be the Hamiltonian for the electron in a Coulomb
field due to central charge $Ze$, find the first-order correction to the
ground state energy, and show that for $R \ll a$ it is quadratic in $R$.


12.6° The Hamiltonian of a rigid rotator in a magnetic field perpendicular
to the y-axis is of the form

\[ AL^2 + BL_3 + CL_1. \]

If $A$ and $B$ are very much larger than $C$ find the energy eigenstates
correct to first order in $C$ and the energy eigenvalues to second order.
Compare your results with the exact answer.


12.7° The energy levels of a hydrogen atom with Hamiltonian $H_0$ are
known to be $E_n$ for each eigenstate $\psi_{nlm}$, where $n$ is the principal
quantum number and the labels $l$ and $m$ are associated with the
total angular momentum and $L_3$. In the presence of a weak magnetic
field the Hamiltonian becomes

\[ H = H_0 - \frac{B}{2\mu}L_3. \]

Describe the splitting of the first excited energy level, giving the
degeneracies.


12.8 A particle of mass $m$ moves in two dimensions under the influence of
a potential $\frac12 m\omega^2((1 + \epsilon)x^2 + (1 - \epsilon)y^2)$. What are the possible energy
levels if $\epsilon = 0$? When $\epsilon$ does not vanish, use degenerate perturbation
theory to calculate the corrections to the energy of the first excited
state to order $\epsilon$ and the corresponding unperturbed states. Calculate
the energy levels of the system directly and compare the exact
answer with that given by perturbation theory.
[The normalized wave functions for the two lowest states of the
harmonic oscillator whose Hamiltonian is $P^2/2m + \frac12 m\omega^2X^2$ are
$\psi_0 = (\pi\alpha^2)^{-1/4}\exp(-x^2/2\alpha^2)$ and $\sqrt{2}(x/\alpha)\psi_0(x)$ with energies $\frac12\hbar\omega$
and $\frac32\hbar\omega$ respectively, where $\alpha^2 = \hbar/m\omega$.]


12.9° A particle of mass $M$ moves in the xy-plane under the influence of
a potential $\epsilon U$, where

\[ U = \begin{cases} xy & \text{for } 0 < x < \pi,\ 0 < y < \pi \\ \infty & \text{otherwise} \end{cases} \]

and $\epsilon \ll \hbar^2/M$. Show that when $\epsilon = 0$ the lowest two eigenvalues are
$\hbar^2/M$ and $5\hbar^2/2M$ and find the corresponding eigenvectors. What
are the degeneracies in each case?
Use time-independent perturbation theory to show that, correct
to the first order in $\epsilon$, the lowest three energy levels are

\[ \frac{\hbar^2}{M} + \epsilon a^2, \qquad \frac{5\hbar^2}{2M} + \epsilon(a^2 - b^2), \qquad \frac{5\hbar^2}{2M} + \epsilon(a^2 + b^2), \]

where $a = \pi/2$ and $b = 16/9\pi$.


12.10° For $H_0$ and $H'$ self-adjoint operators and $\lambda \in \mathbb{R}$ a family of
Hamiltonians is defined by

\[ H_\lambda = H_0 + \lambda H'. \]

Show that, with $\psi_0$ normalized, the second-order correction can be
written as

\[ E'' = \langle(H' - E')\psi_0|\psi'\rangle. \]


12.11° A particle of mass $m$ moves in the interval $[0, a]$ under the influence
of a potential $V(x) = \lambda\cos(N\pi x/a)$ with $N$ a positive integer. Given
that the energy is approximately $\hbar^2k^2\pi^2/2ma^2$, find the correction
to the energy to first order in $\lambda$ and show that it vanishes for all but
one value of $N$. Find the second-order correction to the energy level
in those cases where the first-order correction vanishes.

[It may be assumed that the normalized wave functions for a particle
moving freely in the interval $[0, a]$ are given for $k = 1, 2, \ldots$ by

\[ \phi_k(x) = \sqrt{\frac{2}{a}}\,\sin\frac{k\pi x}{a}.] \]

A small additional term $H'$ is added to the Hamiltonian $H_0$. Find an
expression for the first-order correction to the wave function when
$\phi = H'\psi_0$ is an eigenfunction of $H_0$ with energy $\epsilon \neq E_0$, and show
that to second order the energy is


A*Itell? 


Pe: 


By choosing Hp = P?/2m+mw?X?/2+x for suitable w and k, show 
that the ground state energy of 


2 
ae line? X27 4AX4 
2m 2 


is approximately 
1 3 MM MeN? 
a F ~ 2m (sa) | ; 


where w? — a*w — 6Ah/m? = 0. 
[The normalized wave function for the third excited state of Hp = 
P2/2m + mw*X? /2 is 


Hi c (mary ~12 (“) + | bo; 


where wo is the ground state wave function. ] 


13* Iterative perturbation theory


Even perturbation theory is no more complicated than the forced vibrations
of a string.


ERWIN SCHRÖDINGER, letter to Willy Wien, 22 February 1926


13.1. The Brillouin—Wigner iteration 


The Rayleigh–Schrödinger theory described in the preceding chapter is by
no means the only approach to stationary perturbation theory, nor is it
usually the most powerful. As one of many possible alternatives we shall
now describe an iterative procedure for approximating stationary states,
sometimes called Wigner–Brillouin perturbation theory, which resembles
the Feynman–Dyson method of Section 11.3.

It is sensible to approximate $\psi$ by the closest appropriate eigenvector of
$H_0$, that is by its projection onto the $E_0$-eigenspace. We therefore suppose
that $\psi_0 = P_0\psi$ where $P_0$ is the projection operator onto the space of all
$H_0$-eigenvectors with eigenvalue $E_0$. (Clearly we require that $P_0\psi \neq 0$,
otherwise we should have chosen a different energy $E_0$.) It will also be useful
to introduce the complementary projection $Q = 1 - P_0$ onto $\ker(H_0 - E_0)^\perp$.


Lemma 13.1.1. If $\psi$ is normalized so that $\psi_0 = P_0\psi$ is a unit vector
then the eigenstates for $H$ and $H_0$ are related by the equation

\[ \psi = \psi_0 + (E_0 - H_0)^{-1}Q(H' + E_0 - E)\psi. \]


Proof. We know that

\[ (H' + E_0 - E)\psi = (H - H_0 + E_0 - E)\psi = (H - E)\psi + (E_0 - H_0)\psi = (E_0 - H_0)\psi. \tag{13.1} \]

By definition $(H_0 - E_0)P_0$ vanishes, and, taking its adjoint, so does
$P_0(H_0 - E_0)$. We therefore see that $P_0(H' + E_0 - E)\psi = 0$, and consequently that

\[ (H' + E_0 - E)\psi = Q(H' + E_0 - E)\psi. \tag{13.2} \]

Since $E_0$ is not an eigenvalue of $H_0$ on the image of $Q$, the inverse of
$(E_0 - H_0)$ is well defined there and we deduce that

\[ Q\psi = (E_0 - H_0)^{-1}Q(H' + E_0 - E)\psi. \tag{13.3} \]

The formula for $\psi$ now follows on substituting this into the identity
$\psi = (P_0 + Q)\psi = \psi_0 + Q\psi$. □


Remark 13.1.1. When $E_0$ is non-degenerate, the formula for $\psi$ can be
given more explicitly by choosing an orthonormal basis $\{\psi_0, \phi_1, \phi_2, \phi_3, \ldots\}$
consisting of eigenvectors of $H_0$ satisfying $H_0\phi_\alpha = \epsilon_\alpha\phi_\alpha$ for $\alpha = 1, 2, \ldots$.
Then, writing $Q$ in terms of the orthonormal basis of eigenvectors we have

\[ \psi = \psi_0 + (E - H_0)^{-1}QH'\psi
 = \psi_0 + (E - H_0)^{-1}\sum_{\alpha=1}^{\infty} \langle\phi_\alpha|H'\psi\rangle\,\phi_\alpha
 = \psi_0 + \sum_{\alpha=1}^{\infty} \langle\phi_\alpha|H'\psi\rangle (E - \epsilon_\alpha)^{-1}\phi_\alpha, \tag{13.4} \]

which is reminiscent of the formula (12.48) for the first-order
Rayleigh–Schrödinger wave function.

The above lemma, coupled with the fact that the energy, $E$, is the
expectation of the Hamiltonian, suggests the following iterative scheme for
approximating eigenvalues and eigenvectors.


Definition 13.1.1. Let $E_0$ and $\psi_0$ be as above, and for $n = 0, 1, \ldots$
define the Brillouin–Wigner approximation

\[ E_{n+1} = \langle\psi_n|H\psi_n\rangle/\|\psi_n\|^2, \]
\[ \psi_{n+1} = \psi_0 + (E_0 - H_0)^{-1}Q(H' + E_0 - E_{n+1})\psi_n. \]


It is reasonable to hope that if one has chosen $H_0$, $E_0$, and $\psi_0$ wisely
then these sequences will converge to the true values $E$ and $\psi$.

Lemma 13.1.2. The approximate wave functions satisfy the
normalization condition

\[ \langle\psi_0|\psi_n\rangle = 1 \]

for all positive integers $n$. The first-order approximation to the energy
is given by

\[ E_1 = E_0 + \langle\psi_0|H'\psi_0\rangle. \]


Proof. Since $Q$ commutes with $H_0$ and $\psi_0$ is orthogonal to the range of
$Q$, we have

\[ \langle\psi_0|\psi_{n+1}\rangle = \langle\psi_0|\psi_0\rangle + \langle\psi_0|Q(E_0 - H_0)^{-1}(H' + E_0 - E_{n+1})\psi_n\rangle = \langle\psi_0|\psi_0\rangle = 1. \tag{13.5} \]

(It is similarly possible to show that $P_0\psi_n = \psi_0$ for all $n$.) The definition
also gives

\[ E_1 = \langle\psi_0|H\psi_0\rangle = \langle\psi_0|H_0\psi_0\rangle + \langle\psi_0|H'\psi_0\rangle = E_0 + \langle\psi_0|H'\psi_0\rangle, \tag{13.6} \]

which completes the proof. □


It is worth remarking that

\[ \|\psi_n - \psi_0\|^2 = \|\psi_n\|^2 - \langle\psi_n|\psi_0\rangle - \langle\psi_0|\psi_n\rangle + \|\psi_0\|^2 = \|\psi_n\|^2 - 1. \tag{13.7} \]
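The iteration of Definition 13.1.1 is straightforward to run when $H_0$ is a diagonal matrix, because $Q$ simply removes the $\psi_0$ component and $(E_0 - H_0)^{-1}$ acts diagonally on what is left. The following sketch assumes a real symmetric $H'$, a non-degenerate $E_0$ stored as the first entry of `energies`, and $\psi_0$ equal to the first basis vector; the function name and the two-level test matrix are illustrative assumptions.

```python
import math


def brillouin_wigner(energies, h_prime, steps):
    """Iterate Definition 13.1.1 in the eigenbasis of H_0, starting from psi_0 = e_0.

    energies: diagonal entries of H_0, with energies[0] = E_0 (non-degenerate);
    h_prime:  real symmetric matrix of H' as nested lists.
    Returns ([E_1, ..., E_steps], psi_steps).
    """
    dim = len(energies)
    e0 = energies[0]
    psi = [1.0] + [0.0] * (dim - 1)
    history = []
    for _ in range(steps):
        hp_psi = [sum(h_prime[i][j] * psi[j] for j in range(dim)) for i in range(dim)]
        h_psi = [energies[i] * psi[i] + hp_psi[i] for i in range(dim)]
        norm2 = sum(c * c for c in psi)
        e_next = sum(psi[i] * h_psi[i] for i in range(dim)) / norm2
        # Q drops the psi_0 component; (E_0 - H_0)^{-1} is diagonal on the rest.
        psi = [1.0] + [
            (hp_psi[i] + (e0 - e_next) * psi[i]) / (e0 - energies[i])
            for i in range(1, dim)
        ]
        history.append(e_next)
    return history, psi


if __name__ == "__main__":
    # Hypothetical model: H = [[0, 0.1], [0.1, 1]].
    history, psi = brillouin_wigner([0.0, 1.0], [[0.0, 0.1], [0.1, 0.0]], 10)
    print(history[:3])
    print(history[-1], (1 - math.sqrt(1.04)) / 2)
```

Keeping the first component of every iterate equal to 1 is the normalization $\langle\psi_0|\psi_n\rangle = 1$ of Lemma 13.1.2, and the first entry of the returned list is $E_1 = E_0 + \langle\psi_0|H'\psi_0\rangle$.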


13.2. Convergence of the iteration scheme 


When looking for higher order approximations it is often useful to use
slightly different formulae.

Theorem 13.2.1. Let $E_0$ be a non-degenerate energy level, and set
$\delta_n = \psi_n - \psi_{n-1}$. Then the approximate wave function $\psi_n$ satisfies
the equation

\[ (E_0 - H_0)\delta_n = (H - E_n)\psi_{n-1} - \langle\psi_0|(H - E_n)\psi_{n-1}\rangle\,\psi_0 \]

and, in particular, $(E_0 - H_0)\psi_1 = (H' + E_0 - E_1)\psi_0$. The energy
difference between successive approximations is given by

\[ (E_{n+1} - E_n)\|\psi_n\|^2 = \langle\delta_n|[(H - E_n) + 2(E_0 - H_0)]\delta_n\rangle. \]


Proof. Applying $E_0 - H_0$ to the recursive definition 13.1.1, and using
the fact that $(E_0 - H_0)\psi_0$ vanishes, gives

\[ (E_0 - H_0)\psi_n = Q(H' + E_0 - E_n)\psi_{n-1}. \tag{13.8} \]

Similarly, the first term in

\[ (E_0 - H_0)\psi_{n-1} = P_0(E_0 - H_0)\psi_{n-1} + Q(E_0 - H_0)\psi_{n-1} \tag{13.9} \]

vanishes, so that

\[ (E_0 - H_0)\delta_n = Q(H' + E_0 - E_n)\psi_{n-1} - Q(E_0 - H_0)\psi_{n-1} = Q(H - E_n)\psi_{n-1}. \tag{13.10} \]

Since $Q$ is the complement of the projection onto $\psi_0$, we have, for any $\phi$,
$Q\phi = (1 - P_0)\phi = \phi - \langle\psi_0|\phi\rangle\psi_0$, from which the first formula follows. When
$n = 1$ there is a simplification since, by definition, $\langle\psi_0|(H - E_1)\psi_0\rangle = 0$.
This gives

\[ (E_0 - H_0)\psi_1 = (E_0 - H_0)\psi_0 + (H - E_1)\psi_0, \tag{13.11} \]

which reduces to the stated formula.
To obtain the formula for the energy difference, we start with the
expression

\[ (E_{n+1} - E_n)\|\psi_n\|^2 = \langle\psi_n|(H - E_n)\psi_n\rangle, \tag{13.12} \]

and substitute $\psi_n = \psi_{n-1} + \delta_n$. Since $\langle\psi_{n-1}|(H - E_n)\psi_{n-1}\rangle$ vanishes by
definition of $E_n$, this gives

\[ \langle\delta_n|(H - E_n)\delta_n\rangle + \langle\psi_{n-1}|(H - E_n)\delta_n\rangle + \langle\delta_n|(H - E_n)\psi_{n-1}\rangle. \tag{13.13} \]

The middle terms may also be simplified since

\[ \langle\delta_n|(H - E_n)\psi_{n-1}\rangle = \langle\delta_n|(E_0 - H_0)\delta_n\rangle + \langle\delta_n|\psi_0\rangle\langle\psi_0|(H - E_n)\psi_{n-1}\rangle, \tag{13.14} \]

and $\langle\delta_n|\psi_0\rangle$ vanishes by Lemma 13.1.2. Collecting the non-vanishing terms
we have

\[ \langle\delta_n|(H - E_n)\delta_n\rangle + 2\langle\delta_n|(E_0 - H_0)\delta_n\rangle, \tag{13.15} \]

which gives the stated formula. □


Theorem 13.2.2. If the Brillouin–Wigner approximants $E_n$ and $\psi_n$
converge then their limits $E$ and $\psi$ satisfy $H\psi = E\psi$.


Proof. If we assume that $E_n$ and $\psi_n$ converge then, taking the limit of
Theorem 13.2.1 as $n \to \infty$, their limits $E$ and $\psi$ will satisfy

\[ (E_0 - H_0)(\psi - \psi) = (H - E)\psi - \langle\psi_0|(H - E)\psi\rangle\,\psi_0, \tag{13.16} \]

so that

\[ (H - E)\psi = \langle\psi_0|(H - E)\psi\rangle\,\psi_0. \tag{13.17} \]

Taking the inner product with $\psi$, and recalling the normalization of Lemma
13.1.2, gives

\[ \langle\psi|(H - E)\psi\rangle = \langle\psi_0|(H - E)\psi\rangle. \tag{13.18} \]

Now the left-hand side is the limit of $\langle\psi_n|(H - E_{n+1})\psi_n\rangle$, and therefore
vanishes, so that equation (13.17) reduces to $(H - E)\psi = 0$, as required. □


Theorem 13.2.3. Let $E_n$ and $\psi_n$ be the Brillouin–Wigner
approximations, and assume that $E$ and $\psi$ are a true eigenvalue and
eigenvector. Then

\[ E - E_{n+1} = \langle\psi - \psi_n|(E - H)(\psi - \psi_n)\rangle/\|\psi_n\|^2, \]
\[ \psi - \psi_{n+1} = (E_0 - H_0)^{-1}Q\left[(H' + E_0 - E_{n+1})(\psi - \psi_n) + (E_{n+1} - E)\psi\right]. \]


Proof. We start by setting $\psi_n = \psi + \delta$ in the formula

\[ (E_{n+1} - E)\|\psi_n\|^2 = \langle\psi_n|(H - E)\psi_n\rangle, \tag{13.19} \]

to get

\[ \langle\delta|(H - E)\delta\rangle + \langle\psi|(H - E)\delta\rangle + \langle\delta|(H - E)\psi\rangle + \langle\psi|(H - E)\psi\rangle. \tag{13.20} \]

Since $H\psi = E\psi$ the last three terms vanish, leaving the stated formula.
For the wave function we may use Lemma 13.1.1 together with the
recursive definition 13.1.1 to obtain

\[ \psi - \psi_{n+1} = (E_0 - H_0)^{-1}Q(H' + E_0 - E)\psi - (E_0 - H_0)^{-1}Q(H' + E_0 - E_{n+1})\psi_n
 = (E_0 - H_0)^{-1}Q\left[(H' + E_0 - E_{n+1})(\psi - \psi_n) + (E_{n+1} - E)\psi\right], \tag{13.21} \]

as stated. □

The first formula shows that if the wave function converges then the
energy converges quadratically. This is what makes this scheme rather
faster than some of the apparently simpler alternatives. In general, it can
be more powerful than the Rayleigh–Schrödinger theory, as well as avoiding
the assumption that $E_u$ and $\psi_u$ have power series expansions.


Theorem 13.2.4. Let $\psi_n$ and $E_n$ be the Brillouin–Wigner approximations to the wave function and energy for $H_u = H_0 + uH'$, and suppose that the true energy $E$ and wave function $\psi$ depend analytically on $u$. Then, under the assumptions of Theorem 13.2.3, $\psi_n$ is accurate to order $u^n$, and $E_n$ is accurate to order $u^{2n-1}$.


Proof. Since $\psi - \psi_0$ vanishes when $u = 0$ it is divisible by $u$. We can now prove by induction that $\psi - \psi_n$ is divisible by $u^{n+1}$ and $E - E_n$ by $u^{2n}$, using

\[ \psi - \psi_n = (E_0 - H_0)^{-1}Q\left[(uH' + E_0 - E_n)(\psi - \psi_{n-1}) + (E_n - E)\psi\right], \tag{13.22} \]

and also

\[ E - E_n = \langle\psi - \psi_{n-1}|(E - H)(\psi - \psi_{n-1})\rangle/\|\psi_{n-1}\|^2 \tag{13.23} \]

from Theorem 13.2.3. This means that $\psi_n$ is already correct to order $u^n$ and $E_n$ is correct to order $u^{2n-1}$. □
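The recursion behind these theorems can be tried out on a small matrix. The following is only a sketch under stated assumptions: the 3×3 matrices `H0` and `Hp` are illustrative values chosen here, not taken from the text, and the update rules $E_{n+1} = \langle\psi_n|H\psi_n\rangle/\|\psi_n\|^2$ and $\psi_{n+1} = \psi_0 + (E_0 - H_0)^{-1}Q(H' + E_0 - E_{n+1})\psi_n$ are the ones used in the proof of Theorem 13.2.3.

```python
# Brillouin-Wigner iteration for a small symmetric matrix (illustrative values).
H0 = [0.0, 1.0, 2.0]                      # diagonal of H_0; psi_0 = e_0, E_0 = 0
Hp = [[0.0, 0.1, 0.0],
      [0.1, 0.0, 0.1],
      [0.0, 0.1, 0.0]]                    # perturbation H' (hypothetical example)
E0 = H0[0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply_Hp(v):
    """Return H' v."""
    return [sum(Hp[i][j] * v[j] for j in range(3)) for i in range(3)]

def apply_H(v):
    """Return (H_0 + H') v."""
    hp = apply_Hp(v)
    return [H0[i] * v[i] + hp[i] for i in range(3)]

def bw_step(psi):
    """One step: E_{n+1} from the Rayleigh quotient of psi_n, then psi_{n+1}."""
    E_next = dot(psi, apply_H(psi)) / dot(psi, psi)
    hp = apply_Hp(psi)
    w = [hp[i] + (E0 - E_next) * psi[i] for i in range(3)]   # (H' + E_0 - E_{n+1}) psi_n
    # Q removes the psi_0 component; (E_0 - H_0)^{-1} is diagonal on the complement.
    psi_next = [1.0] + [w[i] / (E0 - H0[i]) for i in (1, 2)]
    return E_next, psi_next

psi = [1.0, 0.0, 0.0]
energies = []
for _ in range(40):
    E, psi = bw_step(psi)
    energies.append(E)

# Residual of the eigenvalue equation H psi = E psi at the end of the iteration.
residual = [hv - energies[-1] * v for hv, v in zip(apply_H(psi), psi)]
```

For a perturbation this small the residual falls to rounding error within a few dozen steps, in line with Theorem 13.2.2; for larger perturbations convergence is not guaranteed.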
13.3. The Dalgarno–Lewis method


Impressive though the formula for $\psi'$ in Theorem 12.4.3 looks, it is usually impractical to calculate an infinite series of inner products. Of course, one may be lucky and find that most of the terms vanish, so that the series collapses down to just a few terms, but an alternative approach to finding a first-order wave function, suggested by A. Dalgarno and J.T. Lewis, is often easier.

Since the vector $\psi_n$ should be close to $\psi_0$ we try $\psi_n = (1 + \Phi_n)\psi_0$ where $\Phi_n$ is an operator that remains to be determined. (It will be useful to make the obvious convention that $\Phi_0$ vanishes.)

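By Theorem 13.2.1 the approximate wave functions satisfy

\[ \begin{aligned} Q(H' + E_0 - E_{n+1})(1 + \Phi_n)\psi_0 &= (E_0 - H_0)(1 + \Phi_{n+1})\psi_0 \\ &= (1 + \Phi_{n+1})H_0\psi_0 - H_0(1 + \Phi_{n+1})\psi_0 \\ &= [1 + \Phi_{n+1}, H_0]\psi_0 \\ &= [\Phi_{n+1}, H_0]\psi_0. \end{aligned} \tag{13.24} \]

We have thus proved the following result: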


Theorem 13.3.1. The sequence of vectors $\psi_n = (1 + \Phi_n)\psi_0$ defined by $\Phi_0 = 0$ and

\[ [\Phi_{n+1}, H_0]\psi_0 = Q(H' + E_0 - E_{n+1})(1 + \Phi_n)\psi_0 \]

satisfies the Brillouin–Wigner equations for the approximate eigenstate. In particular, the first-order approximation $\psi_1 = (1 + \Phi_1)\psi_0$ can be obtained by solving the equation

\[ (H' + E_0 - E_1)\psi_0 = [\Phi_1, H_0]\psi_0. \]


At this point it is useful to recall that $\psi$ and $\psi_0$ are wave functions and that a plausible form for the operator $\Phi_n$ would be multiplication by a function $\phi_n$, that is

\[ (\Phi_n\psi_0)(x) = \phi_n(x)\psi_0(x). \tag{13.25} \]

Moreover, in practice we usually start with

\[ H_0 = -\frac{\hbar^2}{2m}\nabla^2 + V. \tag{13.26} \]

Then multiplication by $V$ commutes with multiplication by $\phi_n$ so that

\[ \begin{aligned} [H_0, \Phi_n]\psi_0 &= -\frac{\hbar^2}{2m}[\nabla^2, \phi_n]\psi_0 \\ &= -\frac{\hbar^2}{2m}\left[\nabla^2(\phi_n\psi_0) - \phi_n\nabla^2\psi_0\right] \\ &= -\frac{\hbar^2}{2m}\left[(\nabla^2\phi_n)\psi_0 + 2\nabla\phi_n\cdot\nabla\psi_0\right]. \end{aligned} \tag{13.27} \]


Thus we arrive at a differential equation for $\phi_{n+1}$:

\[ \frac{\hbar^2}{2m}\left[(\nabla^2\phi_{n+1})\psi_0 + 2\nabla\phi_{n+1}\cdot\nabla\psi_0\right] = (H' + E_0 - E_{n+1})(1 + \phi_n)\psi_0. \tag{13.28} \]
This can be further refined into the following theorem: 


Theorem 13.3.2. When $H_0 = -\hbar^2\nabla^2/2m + V$ and $H'$ is a multiplication operator, $\Phi_n$ can be expressed as multiplication by the solution $\phi_n$ of the differential equation

\[ \frac{\hbar^2}{2m}\,\mathrm{div}\left(\psi_0^2\,\mathrm{grad}\,\phi_{n+1}\right) = (H' + E_0 - E_{n+1})(1 + \phi_n)\psi_0^2. \]


Proof. We multiply equation (13.28) by $\psi_0$ and simplify to get the result. □


Corollary 13.3.3. In one dimension, when $H_0 = P^2/2m + V$ and $H'$ is a multiplication operator then $\Phi_n$ can be expressed as multiplication by the function

\[ \phi_{n+1}(x) = \frac{2m}{\hbar^2}\int \psi_0^{-2}\int (H' + E_0 - E_{n+1})(1 + \phi_n)\psi_0^2. \]


Proof. The one-dimensional version of the theorem gives $\phi_{n+1}$ as the solution of the differential equation

\[ \frac{\hbar^2}{2m}\left(\psi_0^2\phi_{n+1}'\right)' = (H' + E_0 - E_{n+1})(1 + \phi_n)\psi_0^2, \tag{13.29} \]

which can be integrated to give the result. □
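The inner integral of Corollary 13.3.3 is easy to evaluate numerically. As a rough sketch, assuming units with $\hbar = m = \omega = 1$, an oscillator ground state $\psi_0^2 \propto e^{-x^2}$, a perturbation $H' = Fx$ with $E_1 = E_0$ and $\phi_0 = 0$, and an illustrative field strength `F` (all choices made here for the illustration), the derivative $\phi_1'(x) = 2\psi_0(x)^{-2}\int_{-\infty}^{x} Ft\,\psi_0(t)^2\,dt$ should come out constant and equal to $-F$:

```python
import math

F = 0.3   # field strength (illustrative); units with hbar = m = omega = 1

def psi0_sq(x):
    """Unnormalized ground state density exp(-x^2); the normalization cancels."""
    return math.exp(-x * x)

def phi1_prime(x, lower=-8.0, steps=4000):
    """phi_1'(x) = 2 psi0(x)^{-2} * integral from lower to x of F t psi0(t)^2 dt.

    The integral is taken by the trapezium rule; 'lower' stands in for -infinity.
    """
    h = (x - lower) / steps
    total = 0.5 * (F * lower * psi0_sq(lower) + F * x * psi0_sq(x))
    for i in range(1, steps):
        t = lower + i * h
        total += F * t * psi0_sq(t)
    return 2.0 * total * h / psi0_sq(x)

slopes = [phi1_prime(x) for x in (-1.0, 0.0, 0.5, 1.5)]
```

A constant slope $-F$ corresponds, in these units, to a multiplier $\phi_1$ that is linear in $x$.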
13.4. Example


As an example of these procedures we consider a one-dimensional oscillator in a uniform field $F$ so that the potential is $V = \tfrac12 m\omega^2x^2 + Fx$. This can be solved directly by noting that $V = \tfrac12 m\omega^2(x + F/m\omega^2)^2 - F^2/2m\omega^2$. By changing the origin we see that the energy levels are just $(n + \tfrac12)\hbar\omega - F^2/2m\omega^2$. For the purposes of illustration, however, we shall regard the term $H' = FX$ as a perturbation of the usual oscillator Hamiltonian, $H_0$, and investigate what happens to the ground state. The first approximation to the energy is given by

\[ \begin{aligned} E_1 &= \langle\psi_0|H\psi_0\rangle \\ &= \langle\psi_0|H_0\psi_0\rangle + F\langle\psi_0|X\psi_0\rangle \\ &= \tfrac12\hbar\omega + F\int x|\psi_0(x)|^2\,dx. \end{aligned} \tag{13.30} \]


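Since the integrand in the last term is odd the integral vanishes, showing that $E_1 = E_0$. Applying the Dalgarno–Lewis method to find the first-order wave function leads us to the equation

\[ \frac{\hbar^2}{2m}\left(\psi_0^2\phi_1'\right)' = (H' + E_0 - E_1)\psi_0^2 = Fx\psi_0^2. \tag{13.31} \]

Substituting $\psi_0(x) = N\exp(-m\omega x^2/2\hbar)$ and simplifying, we obtain

\[ \frac{\hbar^2}{2m}\phi_1'' - \hbar\omega x\phi_1' = Fx, \tag{13.32} \]

which has the obvious solution $\phi_1 = -(F/\hbar\omega)x$. The first-order wave function can therefore be written in terms of the normalized ground and first excited state wave functions of the oscillator, $\psi_0$ and $\chi_1$ (the latter written $\chi_1$ here to keep it distinct from the approximant $\psi_1$), as

\[ \begin{aligned} \psi_1 &= (1 + \phi_1)\psi_0 \\ &= \psi_0 - \frac{F}{\hbar\omega}x\psi_0 \\ &= \psi_0 - \frac{F}{\hbar\omega}\sqrt{\frac{\hbar}{2m\omega}}\,\chi_1. \end{aligned} \tag{13.33} \]

To evaluate the second Brillouin–Wigner approximation to the energy using Theorem 13.2.1 we need

\[ \delta_1 = \psi_1 - \psi_0 = -\frac{F}{\hbar\omega}\sqrt{\frac{\hbar}{2m\omega}}\,\chi_1. \tag{13.34} \]

Since $E_1 = E_0$ and $H - H_0 = FX$, this means that we must evaluate

\[ \langle\chi_1|\left[FX + (E_0 - H_0)\right]\chi_1\rangle. \tag{13.35} \]

Now $\langle\chi_1|FX\chi_1\rangle$ vanishes, because the integrand is odd, and $(H_0 - E_0)\chi_1 = \hbar\omega\chi_1$, so this expression reduces to give

\[ (E_2 - E_1)\|\psi_1\|^2 = \langle\delta_1|\left[(H - E_1) + 2(E_0 - H_0)\right]\delta_1\rangle = -\frac{F^2}{\hbar^2\omega^2}\,\frac{\hbar}{2m\omega}\,\hbar\omega = -\frac{F^2}{2m\omega^2}. \tag{13.36} \]

Combining this with $\|\psi_1\|^2 = 1 + F^2/2m\hbar\omega^3$, we obtain

\[ \begin{aligned} E_2 &= \tfrac12\hbar\omega - \left(1 + \frac{F^2}{2m\hbar\omega^3}\right)^{-1}\frac{F^2}{2m\omega^2} \\ &= \tfrac12\hbar\omega - \frac{F^2}{2m\omega^2}\left(1 - \frac{F^2}{2m\hbar\omega^3} + \cdots\right). \end{aligned} \tag{13.37} \]

This agrees with the exact solution up to terms in $F^3$.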
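The size of the discrepancy between the second Brillouin–Wigner energy and the exact ground state energy $\tfrac12\hbar\omega - F^2/2m\omega^2$ can be checked by direct arithmetic. The sketch below assumes units with $\hbar = m = \omega = 1$ and an illustrative field strength; the function names are introduced here only for the illustration. Expanding $(1 + c)^{-1}$ with $c = F^2/2m\hbar\omega^3$ suggests a leading discrepancy of $F^4/4m^2\hbar\omega^5$.

```python
hbar = m = omega = 1.0   # illustrative units

def e2_bw(F):
    """Second Brillouin-Wigner energy, in the form of equation (13.37)."""
    c = F**2 / (2 * m * hbar * omega**3)          # ||psi_1||^2 - 1
    return 0.5 * hbar * omega - (F**2 / (2 * m * omega**2)) / (1 + c)

def e_exact(F):
    """Exact ground state energy of the displaced oscillator."""
    return 0.5 * hbar * omega - F**2 / (2 * m * omega**2)

F = 0.1
gap = e2_bw(F) - e_exact(F)                        # discrepancy, of fourth order in F
leading = F**4 / (4 * m**2 * hbar * omega**5)      # leading term of that discrepancy
```

The discrepancy is positive, as one would expect since $E_2$ is a Rayleigh quotient and so cannot lie below the true ground state energy.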


13.5. The Born approximation 


In Chapter 5 we investigated what happens when a particle moving in one dimension encounters a potential barrier. The freedom of real particles to move in three dimensions means that their behaviour can be much more complicated. Consider what happens in the presence of a 'short range force' for which the potential tends to zero at large distances. Ideally each particle starts so far from the region of interaction that the potential is virtually zero and the particle effectively free. It is fired with known energy towards the area where the potential is large and emerges from its encounter in a new direction, eventually arriving once more into the 'asymptotic region' where the potential is nearly zero and particles move almost freely.

This suggests that, as in perturbation theory, one must compare motion under two different Hamiltonians. In the idealized asymptotic region where the particle starts and finishes its journey, it moves freely with a Hamiltonian $H_0 = -\hbar^2\nabla^2/2m$. During the encounter the Hamiltonian is

\[ H = -\frac{\hbar^2}{2m}\nabla^2 + V, \tag{13.38} \]

so that in this case the whole of the potential energy is regarded as a perturbation. This time we are not interested in calculating the energy $E$ of the particle, since that is under the control of the experimenter and can be assumed to be known. We do, however, want to know the form of the scattered wave function, and for this the Brillouin–Wigner formula,

\[ \begin{aligned} \psi &= \psi_0 + (E - H_0)^{-1}Q(H - H_0)\psi \\ &= \psi_0 + (E - H_0)^{-1}QV\psi, \end{aligned} \tag{13.39} \]

is invaluable. Here we interpret $\psi_0$ as the wave function appropriate to the incoming particle as it starts on its journey.

In these scattering experiments the particle has positive energy, for it must be able to escape from the potential. We shall therefore write $E = \hbar^2k^2/2m$. We take $\psi_0 = A\exp(i\mathbf{k}\cdot\mathbf{r})$ as the wave function appropriate to the incident particle. Recalling that $H_0 = -\hbar^2\nabla^2/2m$, we see that the main problem is to invert $(k^2 + \nabla^2)$. In effect we must solve

\[ (\nabla^2 + k^2)f(\mathbf{r}) = \rho(\mathbf{r}). \tag{13.40} \]

If $k = 0$ then this is the Poisson equation, and there is a generalization of the Poisson integral formula that solves this problem for functions $\rho$ that decay fast enough at $\infty$:

\[ f(\mathbf{r}) = -\frac{1}{4\pi}\int_{\mathbb{R}^3}\frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,\rho(\mathbf{r}')\,d^3r'. \tag{13.41} \]


This can be proved by the same methods as the Poisson formula once one has noticed that $g(r) = \exp(ikr)/r$ is a rotationally invariant solution of the equation $(\nabla^2 + k^2)g = 0$. (Since $g$ is spherically symmetric the equation reduces to

\[ \frac{\partial^2(rg)}{\partial r^2} + k^2(rg) = 0, \tag{13.42} \]

from which it is clear that $rg = \exp(ikr)$ is a solution.)

The projection $Q$ is not needed in this case since it serves only to project onto the subspace where the integral formula makes sense. The Brillouin–Wigner formula therefore becomes the explicit identity

\[ \psi(\mathbf{r}) = Ae^{i\mathbf{k}\cdot\mathbf{r}} - \frac{m}{2\pi\hbar^2}\int_{\mathbb{R}^3}\frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,V(\mathbf{r}')\psi(\mathbf{r}')\,d^3r'. \tag{13.43} \]


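Definition 13.5.1. The Born approximation consists of taking the successive Brillouin–Wigner approximations to the solution of this problem:

\[ \psi_n(\mathbf{r}) = Ae^{i\mathbf{k}\cdot\mathbf{r}} - \frac{m}{2\pi\hbar^2}\int_{\mathbb{R}^3}\frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,V(\mathbf{r}')\psi_{n-1}(\mathbf{r}')\,d^3r'. \]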
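Example 13.5.1. For example, the first-order Born approximation is

\[ \begin{aligned} \psi_1(\mathbf{r}) &= Ae^{i\mathbf{k}\cdot\mathbf{r}} - \frac{m}{2\pi\hbar^2}\int_{\mathbb{R}^3}\frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,V(\mathbf{r}')\psi_0(\mathbf{r}')\,d^3r' \\ &= Ae^{i\mathbf{k}\cdot\mathbf{r}} - \frac{mA}{2\pi\hbar^2}\int_{\mathbb{R}^3}\frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}\,V(\mathbf{r}')e^{i\mathbf{k}\cdot\mathbf{r}'}\,d^3r' \\ &= \left[1 - \frac{m}{2\pi\hbar^2}\int_{\mathbb{R}^3}\frac{e^{ik|\mathbf{r}-\mathbf{r}'| - i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}}{|\mathbf{r}-\mathbf{r}'|}\,V(\mathbf{r}')\,d^3r'\right]Ae^{i\mathbf{k}\cdot\mathbf{r}}. \end{aligned} \tag{13.44} \]

We would expect this to be a reasonable approximation provided that

\[ \left|\frac{m}{2\pi\hbar^2}\int_{\mathbb{R}^3}\frac{e^{ik|\mathbf{r}-\mathbf{r}'| - i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}}{|\mathbf{r}-\mathbf{r}'|}\,V(\mathbf{r}')\,d^3r'\right| \ll 1. \tag{13.45} \]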


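When the particle has again escaped to large distances $r$ we have $|\mathbf{r}-\mathbf{r}'| \approx r - \mathbf{u}\cdot\mathbf{r}'$, where $\mathbf{u} = \mathbf{r}/r$, and so, dropping terms of order $r^{-2}$,

\[ \begin{aligned} \psi_1(\mathbf{r}) &\approx \psi_0(\mathbf{r}) - \frac{mA}{2\pi\hbar^2}\,\frac{e^{ikr}}{r}\int_{\mathbb{R}^3}e^{-ik\mathbf{u}\cdot\mathbf{r}'}e^{i\mathbf{k}\cdot\mathbf{r}'}V(\mathbf{r}')\,d^3r' \\ &= \psi_0(\mathbf{r}) - \frac{mA}{2\pi\hbar^2}\,\frac{e^{ikr}}{r}\int_{\mathbb{R}^3}e^{i(\mathbf{k}-k\mathbf{u})\cdot\mathbf{r}'}V(\mathbf{r}')\,d^3r'. \end{aligned} \tag{13.46} \]

This is the sum of the incident wave $\psi_0$ and a scattered wave, which is a multiple of $\exp(ikr)/r$. For a given direction $\mathbf{u} = \mathbf{r}/r$ the scattering is governed by the coefficient

\[ B(\mathbf{u}) = -\frac{m}{2\pi\hbar^2}\int_{\mathbb{R}^3}e^{i(\mathbf{k}-k\mathbf{u})\cdot\mathbf{r}'}V(\mathbf{r}')\,d^3r'. \tag{13.47} \]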


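The current density of the scattered wave (Definition 2.5.1) is directed radially, with magnitude

\[ \begin{aligned} j_s &= \frac{\hbar|AB|^2}{2mi}\left[\frac{e^{-ikr}}{r}\left(\frac{ike^{ikr}}{r} - \frac{e^{ikr}}{r^2}\right) - \frac{e^{ikr}}{r}\left(-\frac{ike^{-ikr}}{r} - \frac{e^{-ikr}}{r^2}\right)\right] \\ &= \frac{\hbar k|A|^2|B(\mathbf{u})|^2}{mr^2}, \end{aligned} \tag{13.48} \]

and its flux through an element $r^2d\Omega$ of the surface of a sphere of radius $r$ is therefore

\[ \frac{\hbar k}{m}|A|^2|B(\mathbf{u})|^2\,d\Omega. \tag{13.49} \]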
This is just $|B|^2d\Omega$ times the magnitude of the current density of the incoming wave, $\mathbf{j}_0 = (\hbar|A|^2/m)\mathbf{k}$, and leads us to make the following definition:


Definition 13.5.2. The differential

\[ d\sigma = |B(\mathbf{u})|^2\,d\Omega \]

is called the differential scattering cross-section.
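For a spherically symmetric potential the coefficient (13.47) can be computed from a single radial integral. The sketch below rests on an assumption not derived in the text: carrying out the angular integration in (13.47) gives $B = -(2m/\hbar^2q)\int_0^\infty V(r)\,r\sin(qr)\,dr$, where $q = |\mathbf{k} - k\mathbf{u}| = 2k\sin(\theta/2)$. It is tried on a Yukawa potential $V = Ce^{-r/a}/r$, in units with $\hbar = m = 1$ and illustrative values of $C$, $a$, $k$ and $\theta$, and compared with the closed form $-2mCa^2/\hbar^2(1 + q^2a^2)$ that the same integral yields analytically.

```python
import math

hbar = m = 1.0          # illustrative units
C, a = 1.0, 1.0         # Yukawa strength and range (illustrative values)

def V(r):
    return C * math.exp(-r / a) / r

def born_amplitude(q, r_max=40.0, steps=8000):
    """B = -(2m / (hbar^2 q)) * integral_0^r_max V(r) r sin(q r) dr (Simpson's rule)."""
    h = r_max / steps
    def g(r):
        # r V(r) stays finite as r -> 0 and sin(0) = 0, so the integrand is 0 there
        return 0.0 if r == 0.0 else r * V(r) * math.sin(q * r)
    total = g(0.0) + g(r_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return -(2 * m / (hbar**2 * q)) * total * h / 3

def yukawa_closed_form(q):
    return -2 * m * C * a**2 / (hbar**2 * (1 + (q * a)**2))

k, theta = 1.5, math.pi / 3
q = 2 * k * math.sin(theta / 2)       # momentum transfer |k - k u|
numeric = born_amplitude(q)
exact = yukawa_closed_form(q)
cross_section = numeric**2            # d(sigma)/d(Omega) = |B|^2
```

The cut-off `r_max` is harmless here because the potential decays exponentially; a slowly decaying potential would need more care.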


Exercises 


13.1 A harmonic oscillator with Hamiltonian

\[ H_0 = \frac{P^2}{2m} + \tfrac12 m\omega^2X^2 \]

is perturbed by the addition of the term $H' = \lambda mX^2/2$. Show that the first-order correction to the ground state energy is $\lambda\hbar/4\omega$ and that the first-order correction to the wave function is a multiple of the second excited state wave function.

Calculate the second-order energy approximation in Rayleigh–Schrödinger and in Brillouin–Wigner perturbation theory, and compare your answers to the exact ground state energy.


13.2° A hydrogen atom in its ground state is acted on by a uniform weak electric field $\mathbf{F}$ of magnitude $F$. Assume that the Hamiltonian of the system has an eigenvector $\psi$ that, referred to spherical polar coordinates $(r, \theta, \phi)$ with $Oz$ in the direction of $\mathbf{F}$, is given by

\[ \psi(r, \theta, \phi) = \psi_0(r)\left[1 - F\cos\theta\,R(r)\right] \]

where $\psi_0$ is the ground state eigenvector of the hydrogen atom Hamiltonian. Prove that $R(r)$ satisfies the differential equation

\[ \frac{d^2R}{dr^2} + 2\left(\frac{1}{r} - \frac{1}{a}\right)\frac{dR}{dr} - \frac{2R}{r^2} = -\frac{2r}{ea} \]

where $a = \hbar^2/me^2$, $m$ is the electron mass, and $-e$ is the electron charge. Verify that

\[ R(r) = \frac{a^2}{e}\left[\frac{r}{a} + \frac12\left(\frac{r}{a}\right)^2\right] \]
is a solution satisfying the necessary boundary conditions. Calculate the energy of the atom in the state described by $\psi$ to second order in $F$, and deduce that the polarizability of the atom, for weak electric fields, is $9a^3/2$.

13.3° Show that the first Born approximation to the differential cross-section for particles of momentum $\hbar k$ scattered by the Yukawa potential, $V = C\exp(-r/a)/r$, is

\[ \frac{4m^2C^2a^4}{\hbar^4\left[1 + 4k^2a^2\sin^2(\theta/2)\right]^2}. \]

By taking the limit as $a \to \infty$, deduce Rutherford's formula for the differential scattering cross-section due to a Coulomb potential, $C/r$,

\[ \frac{m^2C^2}{4\hbar^4k^4}\,\mathrm{cosec}^4(\theta/2). \]

13.4° Show that the first Born approximation to the differential cross-section for the scattering of particles of momentum $\hbar k$ by the spherically symmetric potential

\[ V = Ce^{-r^2/a^2} \]

is proportional to $\exp\left[-2k^2a^2\sin^2(\theta/2)\right]$ where $\theta$ is the scattering angle.

13.5° Derive the first Born approximation to the differential cross-section for the potential $V = Cr^2\exp(-r^2/a^2)$.

13.6° Show that the first Born approximation to the differential cross-section for the potential

\[ V(r) = \begin{cases} \hbar^2A/2m, & r < a \\ 0, & r > a \end{cases} \]

is

\[ \frac{A^2}{K^6}\left(\sin Ka - Ka\cos Ka\right)^2, \]

where $K = 2k\sin\tfrac12\theta$ with $\theta$ the scattering angle.

13.7 Assume that $\psi$ decays at least as fast as $R^{-2}$ for large $R$. By applying Green's formula

\[ \int_D\left(\phi\nabla^2\psi - \psi\nabla^2\phi\right)dV = \int_{\partial D}\left(\phi\nabla\psi - \psi\nabla\phi\right)\cdot d\mathbf{S} \]

to the volume $D$ where $\epsilon < |\mathbf{r} - \mathbf{r}'| < R$, with

\[ \phi = \frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{|\mathbf{r}-\mathbf{r}'|}, \]

and letting $\epsilon \to 0$ and $R \to \infty$, prove formula (13.41).


14 Variational methods


I am convinced that the spectrum of all chemical elements can be obtained ... from quantum theory in a unique manner without physics by bone-headed calculation.

WERNER HEISENBERG, letter to Pascual Jordan, 28 July 1926


14.1. Rayleigh quotients


Variational methods exploit an alternative characterization of eigenvectors and eigenvalues which lends itself better to approximation schemes. This is motivated by the geometrical fact that the points on a quadric, $\mathbf{x}\cdot A\cdot\mathbf{x} = 1$, that are closest to or furthest from the centre lie on the principal axes, and these point along eigenvectors of the matrix, $A$, defining the quadric. The inverse square of the distance $|\mathbf{x}|^{-2}$, which can be rewritten as $\mathbf{x}\cdot A\cdot\mathbf{x}/|\mathbf{x}|^2$, is also stationary for such vectors. This suggests that we should consider how the expectation, $E_\psi(H)$, of a self-adjoint operator $H$ depends on the non-zero vector $\psi$. To emphasize the role of $\psi$ let us introduce the function

\[ f_H(\psi) = E_\psi(H) = \frac{\langle\psi|H\psi\rangle}{\|\psi\|^2}. \tag{14.1} \]


Definition 14.1.1. The function $f_H(\psi)$ is called the Rayleigh quotient.


Theorem 14.1.1. The function $f_H(\psi)$ is stationary with respect to the addition of vectors in a subspace $K$ if and only if $[H - f_H(\psi)]\psi$ is orthogonal to $K$. The stationary values of $f_H$ for all changes in $\psi$ occur when $\psi$ is an eigenvector of $H$, and they are the associated eigenvalues.


Proof. Let us suppose that $\psi$ gives a stationary value of $f_H$. Choose a vector $\phi \in K$ and consider the family $\{\psi_u = \psi + u\phi\}$ for $u \in [0, 1]$, for which

\[ f_H(\psi + u\phi)\,\|\psi + u\phi\|^2 = \langle\psi + u\phi|H(\psi + u\phi)\rangle. \tag{14.2} \]
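Expanding both sides we get

\[ \begin{aligned} f_H(\psi + u\phi)&\left[\|\psi\|^2 + u\left(\langle\phi|\psi\rangle + \langle\psi|\phi\rangle\right) + u^2\|\phi\|^2\right] \\ &= \langle\psi|H\psi\rangle + u\left(\langle\phi|H\psi\rangle + \langle\psi|H\phi\rangle\right) + u^2\langle\phi|H\phi\rangle, \end{aligned} \tag{14.3} \]

which may be differentiated at $u = 0$ to obtain

\[ \frac{df_H}{du}\|\psi\|^2 + f_H(\psi)\left(\langle\phi|\psi\rangle + \langle\psi|\phi\rangle\right) = \langle\phi|H\psi\rangle + \langle\psi|H\phi\rangle. \tag{14.4} \]

Rearranging terms, and using the self-adjointness of $H$, we have

\[ \begin{aligned} \frac{df_H}{du}\|\psi\|^2 &= \left(\langle\phi|H\psi\rangle + \langle H\psi|\phi\rangle\right) - f_H(\psi)\left(\langle\phi|\psi\rangle + \langle\psi|\phi\rangle\right) \\ &= \langle\phi|[H - f_H(\psi)]\psi\rangle + \langle[H - f_H(\psi)]\psi|\phi\rangle. \end{aligned} \tag{14.5} \]

We choose $\phi$ to be the projection of $[H - f_H(\psi)]\psi$ into $K$, so that we may write $[H - f_H(\psi)]\psi = \phi + \chi$ with $\chi \in K^\perp$. Then

\[ \begin{aligned} \frac{df_H}{du}\|\psi\|^2 &= \langle\phi|\phi + \chi\rangle + \langle\phi + \chi|\phi\rangle \\ &= 2\|\phi\|^2, \end{aligned} \tag{14.6} \]

from which we deduce that $\phi$ must vanish for a stationary value. This means that $[H - f_H(\psi)]\psi = \chi$ is orthogonal to $K$, as asserted. When we allow arbitrary variations, $K$ is the whole space and $K^\perp = \{0\}$, so that

\[ H\psi = f_H(\psi)\psi, \tag{14.7} \]

from which the second result follows immediately. □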
Remark 14.1.1. For most interesting physical systems the function $f_H(\psi)$ is bounded below, corresponding to the physical fact that there is a ground state of the least possible energy.


Theorem 14.1.2. If $f_H(\psi)$ is bounded below and achieves its lower bound then $\min_\psi f_H(\psi)$ is the ground state energy of $H$, and any $\psi$ for which it is attained is a ground state wave function. Conversely on a ground state wave function $f_H$ attains its lower bound.


Proof. The minimum value of a function is automatically stationary, so by the preceding result the corresponding vector $\psi$ must be an eigenvector of $H$, and $f_H(\psi)$ is an eigenvalue. Being the lower bound $f_H(\psi)$ must be the least eigenvalue, that is the ground state energy. The converse is immediate. □


Remark 14.1.2. Since $f_H(\psi/\|\psi\|) = f_H(\psi)$, it makes no difference whether we take a minimum over all vectors or just over normalized vectors. If $\psi$ is not allowed to range over all vectors, but only over some preselected subset, then we shall probably not be able to attain the true minimum of $f_H(\psi)$, but we shall certainly get an upper bound on the eigenvalue.


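The virial theorem 14.1.3. Let $H = T + V(\mathbf{r})$, where $T = -\hbar^2\nabla^2/2m$, and suppose that $H\psi = E\psi$. Then
(i) $2E_\psi(T) = E_\psi(\mathbf{X}\cdot\nabla V)$.
(ii) If $V$ is homogeneous of degree $N$ then

\[ E_\psi(T) = \frac{N}{N+2}E \qquad\text{and}\qquad E_\psi(V) = \frac{2}{N+2}E. \]


Proof. Suppose that $\psi$ is a normalized eigenfunction of $H$. Consider the normalized trial wave functions $\psi_u(\mathbf{r}) = \exp(3u/2)\psi(\exp(u)\mathbf{r})$. Applying the chain rule

\[ (\nabla\psi_u)(\mathbf{r}) = e^{5u/2}(\nabla\psi)(e^u\mathbf{r}), \tag{14.8} \]

we see that, by a change of variable in the integrals,

\[ \begin{aligned} f_H(\psi_u) &= \frac{\hbar^2}{2m}\int e^{5u}\left|(\nabla\psi)(e^u\mathbf{r})\right|^2d^3r + \int V(\mathbf{r})\left|\psi(e^u\mathbf{r})\right|^2e^{3u}\,d^3r \\ &= e^{2u}\langle\psi|T\psi\rangle + \int V(e^{-u}\mathbf{r})\left|\psi(\mathbf{r})\right|^2d^3r \\ &= e^{2u}E_\psi(T) + E_\psi\left(V(e^{-u}\mathbf{X})\right). \end{aligned} \tag{14.9} \]

Since $\psi$ is a true eigenvector $f_H(\psi_u)$ must have a stationary value at $u = 0$, which gives the condition that

\[ 2e^{2u}E_\psi(T) + dE_\psi\left(V(e^{-u}\mathbf{X})\right)/du \tag{14.10} \]

must vanish at $u = 0$. Applying the chain rule to calculate the derivative of $V$, this yields the virial theorem

\[ 2E_\psi(T) = E_\psi(\mathbf{X}\cdot\nabla V). \tag{14.11} \]

If $V$ is homogeneous of degree $N$, then Euler's theorem gives $\mathbf{r}\cdot\nabla V = NV$, so that we have

\[ 2E_\psi(T) = NE_\psi(V), \tag{14.12} \]

which combined with the obvious relation, $E_\psi(T) + E_\psi(V) = E$, gives (ii). □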


Remark 14.1.3. Exercise 7.7 outlined a more elementary derivation of this 
result in one dimension, which can easily be extended to the general case, 
and also suggested useful applications of the result to calculate dispersions 
for the harmonic oscillator, and to show that the bound state energies of 
the hydrogen atom must be negative. 
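Part (ii) of the virial theorem is easily tested numerically in the simplest case. The sketch below assumes the one-dimensional oscillator ground state in units with $\hbar = m = \omega = 1$, for which $V = x^2/2$ is homogeneous of degree $N = 2$; the helper `simpson` and the integration range are choices made here for the illustration.

```python
import math

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson's rule on [lo, hi] with an even number of steps."""
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def psi(x):
    """Normalized oscillator ground state, psi(x) = pi^{-1/4} exp(-x^2/2)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def dpsi(x):
    return -x * psi(x)

T = simpson(lambda x: 0.5 * dpsi(x) ** 2, -10.0, 10.0)            # <T> = (1/2) int |psi'|^2
V_mean = simpson(lambda x: 0.5 * x * x * psi(x) ** 2, -10.0, 10.0)  # <V> with V = x^2/2
E = T + V_mean
N = 2   # degree of homogeneity of V
```

Both expectations come out equal to half the energy, as $E_\psi(T) = NE/(N+2)$ and $E_\psi(V) = 2E/(N+2)$ require when $N = 2$.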


14.2. The ground state of helium 


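The Hamiltonian for the helium atom, which has already been discussed in Section 12.3, is

\[ H = -\frac{\hbar^2}{2m}\left(\nabla_1^2 + \nabla_2^2\right) - \frac{2e^2}{4\pi\epsilon_0}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{e^2}{4\pi\epsilon_0|\mathbf{r}_1 - \mathbf{r}_2|}. \tag{14.13} \]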
Proposition 14.2.1. The ground state energy of the helium atom is less than or equal to

\[ -\left(\frac{27}{16}\right)^2\frac{e^2}{4\pi\epsilon_0a}. \]


Proof. Physical intuition might suggest that the presence of the second electron partly shields the positive charge on the nucleus. We therefore select the trial functions

\[ \psi_Z(\mathbf{r}_1, \mathbf{r}_2) = \left(\frac{Z^3}{\pi a^3}\right)\exp\left(-\frac{Z}{a}(r_1 + r_2)\right), \tag{14.14} \]

which are the wave functions appropriate to a nuclear charge $Z$, and minimize $\langle\psi_Z|H\psi_Z\rangle/\|\psi_Z\|^2$ to find the best value of $Z$, and so to estimate the ground state energy.

Now, as in Section 12.3, we know that $\psi_Z$ is the ground state of

\[ H_Z = -\frac{\hbar^2}{2m}\left(\nabla_1^2 + \nabla_2^2\right) - \frac{Ze^2}{4\pi\epsilon_0}\left(\frac{1}{r_1} + \frac{1}{r_2}\right) = T + \frac{Z}{2}V, \tag{14.15} \]

with ground state energy, $E_Z = -Z^2e^2/4\pi\epsilon_0a$. According to the remark following the virial theorem 14.1.3 we have

\[ E_{\psi_Z}(T) = -E_Z = \frac{Z^2e^2}{4\pi\epsilon_0a}, \tag{14.16} \]

\[ E_{\psi_Z}\left(\frac{ZV}{2}\right) = 2E_Z = -\frac{2Z^2e^2}{4\pi\epsilon_0a}, \]

or

\[ E_{\psi_Z}(V) = -\frac{Ze^2}{\pi\epsilon_0a}. \tag{14.17} \]

Finally the electronic repulsion $\langle\psi_Z|\left(e^2/4\pi\epsilon_0|\mathbf{r}_1 - \mathbf{r}_2|\right)\psi_Z\rangle$ was shown in Section 12.3 to be $5Ze^2/32\pi\epsilon_0a$. Combining the various pieces we have

\[ \begin{aligned} \langle\psi_Z|H\psi_Z\rangle &= \frac{Z^2e^2}{4\pi\epsilon_0a} - \frac{Ze^2}{\pi\epsilon_0a} + \frac{5Ze^2}{32\pi\epsilon_0a} \\ &= \frac{Z^2e^2}{4\pi\epsilon_0a} - \frac{27Ze^2}{32\pi\epsilon_0a} \\ &= \frac{e^2}{4\pi\epsilon_0a}\left[\left(Z - \frac{27}{16}\right)^2 - \left(\frac{27}{16}\right)^2\right]. \end{aligned} \tag{14.18} \]

When $Z = 27/16$ this achieves its minimum of

\[ -\left(\frac{27}{16}\right)^2\frac{e^2}{4\pi\epsilon_0a} \approx -0.712\,\frac{e^2}{\pi\epsilon_0a}, \tag{14.19} \]

which differs by less than 2% from the experimental value of $-0.73e^2/\pi\epsilon_0a$ and is a considerable improvement on the $-0.6875e^2/\pi\epsilon_0a$ of first-order perturbation theory. The wave function is that for a nuclear charge $27e/16$ suggesting that the actual charge of $2e$ has been reduced by $5e/16$ owing to the screening effects of the other electron. □
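The minimization over $Z$ is simple enough to do by hand, but it can also serve as a small test of a numerical minimizer. The sketch below measures energies in units of $e^2/4\pi\epsilon_0a$, so that the three contributions to $\langle\psi_Z|H\psi_Z\rangle$ become $Z^2$, $-4Z$ and $5Z/8$; the golden-section search and the bracketing interval $[1, 2]$ are choices made here for the illustration.

```python
def trial_energy(Z):
    """<psi_Z|H psi_Z> in units of e^2/(4 pi eps_0 a): kinetic + nuclear + repulsion."""
    return Z**2 - 4 * Z + 5 * Z / 8

# Golden-section search for the minimum of trial_energy on [1, 2].
lo, hi = 1.0, 2.0
g = (5 ** 0.5 - 1) / 2
for _ in range(80):
    x1 = hi - g * (hi - lo)
    x2 = lo + g * (hi - lo)
    if trial_energy(x1) < trial_energy(x2):
        hi = x2          # the minimum lies in [lo, x2]
    else:
        lo = x1          # the minimum lies in [x1, hi]
Z_best = (lo + hi) / 2
E_best = trial_energy(Z_best)
E_first_order = trial_energy(2.0)   # Z = 2 reproduces first-order perturbation theory
```

Setting $Z = 2$ gives $-2.75$ in these units, which is the first-order perturbation value $-0.6875e^2/\pi\epsilon_0a$ quoted above.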


It is, of course, no accident that the variational method has produced a more accurate result than first-order perturbation theory. It came about because the family of trial functions included the true ground state wave function of the unperturbed Hamiltonian.


Proposition 14.2.2. If the family of trial vectors $\{\psi\}$ includes the ground state $\psi_0$ for the unperturbed Hamiltonian $H_0$ then

\[ E_1 \ge \min_\psi\frac{\langle\psi|H\psi\rangle}{\|\psi\|^2} \ge E, \]

where $E_1$ denotes the first-order perturbation theoretic estimate of the ground state energy, and $E$ denotes the true ground state energy.


Proof. We already know that the energy $E$ gives a lower bound for the Rayleigh quotients. In perturbation theory we had for normalized $\psi_0$:

\[ E_1 = \langle\psi_0|H\psi_0\rangle, \tag{14.20} \]

so clearly the minimum over $\psi$ is less than or equal to $E_1$. □


14.3. Excited states


The excited states also satisfy extremal properties, which can be investigated inductively. Suppose that we have already found the $k$ lowest lying energy levels for $H$, $k \ge 1$, and the eigenvectors span a subspace $\mathcal{H}_k$.


Theorem 14.3.1. If $\inf\{f_H(\psi) : \psi \in \mathcal{H}_k^\perp\}$ is attained for some $\psi$ then this is an eigenvector corresponding to the $(k+1)$-th lowest energy level, and the infimum is the energy.


Proof. The subspace $\mathcal{H}_k$ is invariant under $H$ and so therefore is $\mathcal{H}_k^\perp$. (For $\psi \in \mathcal{H}_k^\perp$, $\phi \in \mathcal{H}_k$, $\langle H\psi|\phi\rangle = \langle\psi|H\phi\rangle$ vanishes.) Applying Theorem 14.1.1 to $H|_{\mathcal{H}_k^\perp}$, we see that if $\inf\{f_H(\psi) : \psi \in \mathcal{H}_k^\perp\}$ is attained then it is the lowest energy level in $\mathcal{H}_k^\perp$. Since the $k$ lowest energy levels were in $\mathcal{H}_k$ this is the $(k+1)$-th lowest energy overall. □


Hilbert's minimax theorem provides a more sophisticated formulation of the same result.


Theorem 14.3.2. If the infimum,

\[ \inf\left\{\max\{f_H(\psi) : \psi \in K\} : \dim(K) = k\right\}, \]

over all $k$-dimensional subspaces $K$ of $\mathcal{H}$ is attained then it is the $k$-th lowest energy level of $H$ and the vector $\psi$ is the corresponding eigenvector.


Proof. We leave this reformulation to the reader. □


14.4. The Rayleigh—Ritz variational theory 


Often it is convenient to take not just a set of trial functions but the space 
that they span. Let us therefore suppose that the trial functions lie in a 
finite-dimensional subspace K. Rayleigh-Ritz theory uses the variational 
principle to reduce the general problem to the finite-dimensional case. 


Theorem 14.4.1. Let $\{v_j\}$ denote a basis of the finite-dimensional subspace, $K$. Then the $k$-th lowest root $E$ of the equation

\[ \det\left(\langle v_j|Hv_k\rangle - E\langle v_j|v_k\rangle\right) = 0 \]

gives an upper bound to the $k$-th lowest eigenvalue of $H$ for $k \le \dim K$.


Proof. From the first part of Theorem 14.1.1 we know that $f_H(\psi)$ is stationary with respect to the addition of vectors in $K$ when $[H - f_H(\psi)]\psi$ is orthogonal to $K$. Writing $E = f_H(\psi)$, we see that

\[ \langle v_j|(H - E)\psi\rangle = 0, \tag{14.21} \]

for all $j$. If $\psi$ is itself in $K$ then it can be expanded as $\psi = \sum c_kv_k$, and then we have

\[ 0 = \sum_k\langle v_j|(H - E)v_k\rangle c_k = \sum_k\left(\langle v_j|Hv_k\rangle - E\langle v_j|v_k\rangle\right)c_k. \tag{14.22} \]

The condition for these linear equations to have a non-trivial solution for the coefficients $c_k$ is

\[ \det\left(\langle v_j|Hv_k\rangle - E\langle v_j|v_k\rangle\right) = 0, \tag{14.23} \]

and the assertion of the theorem follows. □


Definition 14.4.1. The matrix $\left(\langle v_j|v_k\rangle\right)$ is called the Gramian or overlap matrix. The equation

\[ \det\left(\langle v_j|Hv_k\rangle - E\langle v_j|v_k\rangle\right) = 0 \]

is called the secular equation.
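For a two-dimensional trial space the secular equation is a quadratic and can be solved directly. The following sketch uses a toy example chosen here, not taken from the text: $H = \mathrm{diag}(1, 2, 3)$ acting on $\mathbb{R}^3$, so that the true eigenvalues are known, and a non-orthogonal basis $v_1 = (1, 1, 0)$, $v_2 = (0, 1, 1)$ of $K$.

```python
import math

H_diag = [1.0, 2.0, 3.0]          # H = diag(1, 2, 3): eigenvalues known exactly
v = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]             # non-orthogonal basis of a 2-dimensional K

def inner(u, w):
    """Overlap <u|w>."""
    return sum(a * b for a, b in zip(u, w))

def h_elem(u, w):
    """Matrix element <u|H w> for the diagonal H above."""
    return sum(a * d * b for a, d, b in zip(u, H_diag, w))

h11, h12, h22 = h_elem(v[0], v[0]), h_elem(v[0], v[1]), h_elem(v[1], v[1])
s11, s12, s22 = inner(v[0], v[0]), inner(v[0], v[1]), inner(v[1], v[1])

# det(h - E s) = A E^2 + B E + C0 for a real symmetric 2x2 pair (h, s)
A = s11 * s22 - s12 ** 2
B = -(h11 * s22 + h22 * s11 - 2 * h12 * s12)
C0 = h11 * h22 - h12 ** 2
disc = math.sqrt(B * B - 4 * A * C0)
roots = sorted([(-B - disc) / (2 * A), (-B + disc) / (2 * A)])
```

Here the secular equation is $3E^2 - 12E + 11 = 0$, with roots $2 \mp 1/\sqrt{3}$, which lie above the two lowest eigenvalues 1 and 2 as Theorem 14.4.1 asserts.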


For an orthonormal basis the overlap matrix is the identity, and the 
secular equation reduces to the characteristic equation. Many of the most 
accurate calculations of atomic ground state energies have used variational 
methods. As an illustration, in 1994, S.P. Goldman presented a simple 
calculation of the ground state energy of helium with a relative error of 3 
parts in 10°, by using a Rayleigh-Ritz method with 393 basis functions. 
These were written in terms of $r_> = \max(r_1, r_2)$ and $r_< = \min(r_1, r_2)$ as well as $r_1$ and $r_2$ themselves, and each was the sum of a term of the form

et eara Aiea Serine Nig Dag gle hk (14.24) 


with the corresponding term with the particle positions interchanged, and 
each A a carefully chosen function of the angular terms alone. 


14.5. Historical notes 


Variational methods have been used for various problems since the early nineteenth century, and were used in particular by William Strutt, later Lord Rayleigh, in his investigations of wave motion. Following the pioneering work of Ivar Fredholm on integral equations in 1902, Hilbert started the rigorous mathematical study of variational procedures. One of his students, Walther Ritz, who worked on the problem shortly before his death from tuberculosis in 1909, independently rediscovered parts of Rayleigh's work.

One instructive use of variational methods in wave theory can be illustrated by considering the equation for stationary waves on a string, $y_{xx} = -\omega^2c^{-2}y$, which can be written in quantum mechanical notation as $P^2y = \hbar^2\omega^2c^{-2}y$. The fundamental frequency of this string is $\sqrt{E}\,c/\hbar$, where $E$ is the lowest eigenvalue of $P^2$, or equivalently the minimum value of $\langle y|P^2y\rangle$. If a finger stops the string, some previously admissible waves will be suppressed and this may raise the minimum, causing the frequency to rise. This provides a simple qualitative understanding of why open strings on a violin or guitar give lower notes than stopped strings.


Exercises 


14.1° Estimate the ground state energy of the harmonic oscillator by taking trial functions of the form $\psi(x) = (a + bx)\exp(-\tfrac12cx^2)$, where $a$, $b$, and $c$ are real and $c$ is positive.


14.2° A harmonic oscillator has Hamiltonian

\[ H = \frac{P^2}{2m} + \tfrac12m\omega^2X^2. \]

Show that the expectation value of $H$ in the state described by the wave function $x^n\exp(-\tfrac12cx^2)$ (for $n$ a non-negative integer and $c$ a positive real number) is

\[ \left(n + \tfrac12\right)\frac{m\omega^2}{2c} + \left(\frac{n - \tfrac14}{n - \tfrac12}\right)\frac{\hbar^2c}{2m}. \]

By varying $c$ obtain upper bounds on the ground state energy of the harmonic oscillator. What estimate of this energy does one obtain if one allows $n$ to be non-integral and varies $n$ as well as $c$?


14.3° Calculate a bound for the ground state energy of a hydrogen atom in 
a weak uniform magnetic field of strength F along the z-axis, using 
trial functions of the form #(r) = (1 + Az)Wo(r), with wo being the 
normalized ground state wave function of the hydrogen atom in the 
absence of the field. 
[You may use the identities

\[ \int_{\mathbb{R}^3}z^2\psi_0^2\,d^3r = a^2, \qquad \int_{\mathbb{R}^3}\frac{z^2}{r}\psi_0^2\,d^3r = \frac{a}{2}. \]
]
14.4° The total angular momentum operator in spherical polar coordinates has the form

\[ L^2 = -\frac{\hbar^2}{\sin^2\theta}\left[\sin\theta\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{\partial^2}{\partial\phi^2}\right]. \]

Using 1 and $\cos^2\theta$ as a basis for a space of trial wave functions on the unit sphere with inner product

\[ \langle\psi_1|\psi_2\rangle = \int\overline{\psi_1}\,\psi_2\sin\theta\,d\theta\,d\phi, \]

obtain Rayleigh–Ritz estimates for the lowest eigenvalues of $L^2$.


14.5° The Hamiltonian for a particle of mass M moving in one dimension 
is given by 


where a and & are positive constants. For a positive, find the ex- 
pectation value of H with respect to the state vector # defined 
by 1,23 

Yo(z) =e72**. 


Show that the best upper bound that can be put on the ground state 
energy using trial functions of this form occurs when a = d¢ is the 
positive root of 

a? (Kk? + a)8 = an? 


+o p2 

14.6° A particle of mass m moves in the central potential h“U(r)/2m, 
where Pg 

U(r) = Po) ro 7 
and a and 6 are real constants. Find the expectation value of the 
Hamiltonian when the wave function of the particle is given by ¢ = 
exp(—«r) and show that the ground state energy is bounded above 

~ -h? a 
2m 4(1 + 26?) 


A particle of mass m moves in the Coulomb potential 


Ap 
Using the trial wave function $\psi = \exp(-\alpha r)$ ($\alpha > 0$), obtain the best upper bound to the ground state energy.


14.7° A particle of mass $m$ moves along the positive $x$-axis under the influence of a potential $V$ that is infinite for negative $x$ and $Ex$ for positive $x$. By calculating the expectation of $H$ for trial functions of the form $\psi(x) = x^c\exp(-\kappa x)$, as a function of $\kappa$ and $c$, show that the ground state energy is less than

\[ \left(\frac{27E^2\hbar^2}{4m}\right)^{1/3}. \]

[You may assume that $\int_0^\infty x^n\exp(-\lambda x)\,dx = n!\,\lambda^{-(n+1)}$.]


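14.8 A particle of mass $m$ moves along the $x$-axis under the influence of a potential

\[ V = \tfrac12m\omega^2x^2 + \frac{\kappa(\kappa^2 - 1)}{6\hbar}\,m^2\omega^3x^4, \]

where $\kappa > 1$. Show that for trial wave functions of the form $\psi_\alpha(x) = N\exp(-\alpha m\omega x^2/2\hbar)$ the expectation value of the energy is

\[ F(\alpha) = \frac{\hbar\omega}{4}\left(\alpha + \alpha^{-1} + \frac{\kappa(\kappa^2 - 1)}{2\alpha^2}\right). \]

Show that this is minimum when $\alpha = \kappa$, and hence that the ground state energy is less than or equal to $(3\kappa^2 + 1)\hbar\omega/8\kappa$.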


14.9° A particle of mass m moves in a potential 


~hPC ent 
V(r) = a 


a r>Q0, 


where « and C are positive constants. For @ > 0, find an expression 
for F() where 


Ya(r) =e /?, or 5c, 


Show that stationary values of F occur when 
(a + «)? = 2Ca(a + 3K). 


Show further that when C = 2, the best estimate for the ground 
state energy occurs when a = 2k, and find it. 


15 The semi-classical approximation 


Just now I am teaching the foundations of poor deceased mechanics, which is so beautiful. What will her successor look like? With that question I torment myself incessantly.

ALBERT EINSTEIN, letter to Heinrich Zangger, 14 November 1911


15.1. The semi-classical approximation 


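There is another very useful sort of approximation which compares a quantum system not with a simpler quantum system, but with the corresponding classical system. We start by putting the wave function into polar form

\[ \psi = a\exp(iS/\hbar). \]


Theorem 15.1.1. The wave function $\psi = a\exp(iS/\hbar)$ satisfies Schrödinger's equation

\[ i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi \]

if and only if $a$ and $S$ satisfy the coupled equations

\[ \frac{\partial S}{\partial t} + \frac{|\nabla S|^2}{2m} + V = \frac{\hbar^2}{2m}\frac{\nabla^2a}{a}, \]
\[ \frac{\partial(a^2)}{\partial t} + \mathrm{div}\left(\frac{a^2}{m}\nabla S\right) = 0. \]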
m f 


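Proof. We immediately calculate that

\[ \nabla\psi = \left(\frac{\nabla a}{a} + \frac{i}{\hbar}\nabla S\right)\psi, \qquad \frac{\partial\psi}{\partial t} = \left(\frac{1}{a}\frac{\partial a}{\partial t} + \frac{i}{\hbar}\frac{\partial S}{\partial t}\right)\psi, \tag{15.1} \]

and

\[ \begin{aligned} \nabla^2\psi &= \left(\frac{\nabla^2a}{a} - \frac{|\nabla a|^2}{a^2} + \frac{i}{\hbar}\nabla^2S\right)\psi + \left(\frac{\nabla a}{a} + \frac{i}{\hbar}\nabla S\right)\cdot\left(\frac{\nabla a}{a} + \frac{i}{\hbar}\nabla S\right)\psi \\ &= \left(\frac{\nabla^2a}{a} + \frac{i}{\hbar}\nabla^2S + \frac{2i}{\hbar}\frac{\nabla a}{a}\cdot\nabla S - \frac{1}{\hbar^2}|\nabla S|^2\right)\psi. \end{aligned} \tag{15.2} \]

Schrödinger's equation now reads

\[ i\hbar\left(\frac{1}{a}\frac{\partial a}{\partial t} + \frac{i}{\hbar}\frac{\partial S}{\partial t}\right)\psi = \left[-\frac{1}{2m}\left(\hbar^2\frac{\nabla^2a}{a} + i\hbar\nabla^2S + 2i\hbar\frac{\nabla a}{a}\cdot\nabla S - |\nabla S|^2\right) + V\right]\psi, \tag{15.3} \]

and since we are supposing that both $a$ and $S$ are real we may separate the real and imaginary multiples of $\psi$ to obtain the two equations

\[ \frac{\partial S}{\partial t} + \frac{|\nabla S|^2}{2m} + V = \frac{\hbar^2}{2m}\frac{\nabla^2a}{a}, \tag{15.4} \]

and

\[ \frac{1}{a}\frac{\partial a}{\partial t} + \frac{1}{2m}\nabla^2S + \frac{1}{m}\frac{\nabla a}{a}\cdot\nabla S = 0. \tag{15.5} \]

On multiplying through by $2a^2$ the second of these can be rewritten as

\[ \frac{\partial(a^2)}{\partial t} + \mathrm{div}\left(\frac{a^2}{m}\nabla S\right) = 0. \]
□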


Remark 15.1.1. One may readily check that when $\psi = a\exp(iS/\hbar)$, the probability density is $\rho = a^2$ and the probability current is $\mathbf{j} = a^2\nabla S/m$, so that the second condition is just the continuity equation of Proposition 2.5.1.
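The first of the coupled equations can be checked numerically on a familiar stationary state. The sketch below assumes the one-dimensional oscillator ground state in units with $\hbar = m = \omega = 1$, for which one may take $S = -Et$ and $a(x) = \exp(-m\omega x^2/2\hbar)$; then $\nabla S = 0$, the continuity equation holds trivially, and the first equation reduces to $-E + V = (\hbar^2/2m)a''/a$. The finite-difference step `h` and the sample points are arbitrary choices made for the illustration.

```python
import math

hbar = m = omega = 1.0            # illustrative units
E = 0.5 * hbar * omega            # oscillator ground state energy

def a(x):
    """Amplitude of the stationary ground state; here S = -E t, so grad S = 0."""
    return math.exp(-m * omega * x * x / (2 * hbar))

def V(x):
    return 0.5 * m * omega ** 2 * x * x

def quantum_term(x, h=1e-3):
    """(hbar^2 / 2m) a''/a, with a'' approximated by a central difference."""
    second = (a(x + h) - 2 * a(x) + a(x - h)) / (h * h)
    return hbar ** 2 / (2 * m) * second / a(x)

# dS/dt + |grad S|^2/2m + V - (hbar^2/2m) a''/a should vanish at every point
residuals = [-E + V(x) - quantum_term(x) for x in (-1.0, 0.0, 0.7, 2.0)]
```

The residuals vanish up to the error of the finite difference, which shows that for this state the term of order $\hbar^2$ is essential: without it the left-hand side would not vanish.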


So far everything has been exact, but if we look at the first condition we see that the right-hand side is of order $\hbar^2$, whilst the left-hand side is independent of $\hbar$. This suggests the following approximation:


Definition 15.1.1. The semi-classical approximation uses the wave function $\psi = a\exp(iS/\hbar)$ where $a$ and $S$ satisfy the following coupled equations:

\[ \frac{\partial S}{\partial t} + \frac{|\nabla S|^2}{2m} + V = 0, \]
\[ \frac{\partial(a^2)}{\partial t} + \mathrm{div}\left(\frac{a^2}{m}\nabla S\right) = 0. \]


The semi-classical approximation is also known as the WKB approximation after Wentzel, Kramers, and Brillouin who introduced it into quantum theory. Sometimes the initial J is added, for Jeffreys, to commemorate an earlier pioneer of the method. In fact, a closely related technique for solving differential equations was introduced as early as 1837 by Liouville and independently by Green. It is really part of an asymptotic approximation to the wave function for very small values of $\hbar$. One could go further and find a series,

\[ \psi = e^{iS/\hbar}\left(a + \sum_{n=1}^\infty a_n\hbar^n\right), \tag{15.6} \]

which is asymptotic in the sense that

\[ \hbar^{-N}\left|e^{-iS/\hbar}\psi - \left(a + \sum_{n=1}^Na_n\hbar^n\right)\right| \tag{15.7} \]

tends to 0 as $\hbar \to 0$. (Even though the series $\sum a_n\hbar^n$ may be divergent for non-zero $\hbar$, the asymptotic approximation, which at any stage uses only a finite number of terms, can nonetheless be very accurate.)

The particular advantage of the semi-classical approximation is that the
first equation involves only $S$, and in fact it is the Hamilton-Jacobi
equation of classical mechanics. It is the equation satisfied by the
classical action

$$S = \int L\,dt, \tag{15.8}$$

where $L$ is the Lagrangian for the classical system, and the integral is
taken from some fixed starting point, $\mathbf{y}$, to $\mathbf{x}$, and
calculated for a journey taking time $t$. The continuity equation is also
classical in form (being the equation for a classical fluid), so that we
have approximated quantum theory by a classical theory.
The connection between Schrödinger's equation and Hamilton-Jacobi
theory has inspired some attempts to reinterpret quantum theory using
classical hidden variables. David Bohm and Imre Fényes independently
introduced two such examples in 1952. Both relied on the observation that
the true equation for $S$ can be written in the form

$$\frac{\partial S}{\partial t} + \frac{|\nabla S|^2}{2m} + V - \frac{\hbar^2\nabla^2 a}{2ma} = 0, \tag{15.9}$$

which is the Hamilton-Jacobi equation for a potential
$V - \hbar^2\nabla^2 a/2ma$. This suggests that one is dealing with classical
mechanics with an additional potential, $-\hbar^2\nabla^2 a/2ma$, which
depends on the probability density through $a = \sqrt{\rho}$. Bohm's example
postulated this, together with some rules designed to remove other
counterintuitive features of the new potential. Fényes' theory is based on
probabilistic ideas. In a later refinement, due to Edward Nelson, the new
potential appears to be the exact analogue of a term which appears when one
considers the classical Brownian motion of a pollen grain, buffeted by
surrounding molecules.

However, despite their obvious appeal, such ideas do not really return us 
to a familiar kind of classical theory. As we have seen in Section 10.4, the 
new theory cannot be local, that is there must be long range interdepen- 
dence. The equations we have given are non-relativistic equations, and it is 
more difficult to find convincing relativistic analogues. Moreover, there are 
other, more subtle, peculiarities, which mean that one has simply swapped 
one set of problems for another, and it is a matter of personal judgement 
as to which difficulties one prefers. At a purely practical level, it is often
easier to solve the linear Schrödinger equation than the coupled non-linear
equations for $S$ and $a$.


15.2. Semi-classical examples 


We noted above that the argument, S, of the semi-classical wave function 
can be calculated directly from solutions of the classical equations of mo- 
tion, and a from the continuity equation. As a matter of fact the situation 
is even better, since solutions for a can be calculated directly from S, as 
the following result shows. 


Theorem 15.2.1. (Van Vleck) Let $S(\mathbf{x},\mathbf{y})$ be a solution of
the Hamilton-Jacobi equation where $\mathbf{y}$ denotes the position of the
starting point for the action (or constants of integration, if one simply
solves the equation). Then

$$a^2 = \det\left(\frac{\partial^2 S}{\partial x_j\,\partial y_k}\right)$$

gives a solution of the continuity equation.


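Proof. We shall only deal with the case of one spatial dimension, leaving
the general case to the reader. Then

$$a^2 = \frac{\partial^2 S}{\partial x\,\partial y}, \tag{15.10}$$

so that

$$\begin{aligned}
\frac{\partial(a^2)}{\partial t} &= \frac{\partial^3 S}{\partial x\,\partial y\,\partial t} \\
&= -\frac{\partial^2}{\partial x\,\partial y}\left[\frac{1}{2m}\left(\frac{\partial S}{\partial x}\right)^2 + V\right] \\
&= -\frac{\partial}{\partial x}\left(\frac{1}{m}\frac{\partial S}{\partial x}\frac{\partial^2 S}{\partial x\,\partial y}\right) \\
&= -\frac{1}{m}\frac{\partial}{\partial x}\left(a^2\frac{\partial S}{\partial x}\right),
\end{aligned} \tag{15.11}$$

which is the continuity condition. (In fact this result still holds for any
parameters $y_1, y_2, y_3$, and not just the starting coordinates.) □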


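For a free particle there is a solution of the Hamilton-Jacobi equation
of the form $S = m|\mathbf{x}-\mathbf{y}|^2/2t$. (To see this we note that

$$\nabla S = m(\mathbf{x}-\mathbf{y})/t \quad\text{and}\quad \partial S/\partial t = -m|\mathbf{x}-\mathbf{y}|^2/2t^2, \tag{15.12}$$

from which it easily follows that

$$\frac{\partial S}{\partial t} + \frac{|\nabla S|^2}{2m} = 0, \tag{15.13}$$

so that $S$ does satisfy the equation.) The corresponding probability density
is

$$a^2 = \det\left(\frac{\partial^2 S}{\partial x_j\,\partial y_k}\right) = \det\left(-\frac{m}{t}\mathbf{1}\right) = -\left(\frac{m}{t}\right)^3. \tag{15.14}$$

The semi-classical (unnormalized) wave function can therefore be written
as

$$\psi = \left(\frac{m}{t}\right)^{3/2}\exp\left(\frac{im|\mathbf{x}-\mathbf{y}|^2}{2\hbar t}\right). \tag{15.15}$$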
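The free-particle formulae can be checked numerically. The following is a minimal sketch for the one-dimensional analogue $S = m(x-y)^2/2t$, assuming units with $m = 1$ by default; the function names and the finite-difference step sizes are illustrative choices, not part of the text. It estimates the residuals of the Hamilton-Jacobi and continuity equations, with $a^2$ taken from Van Vleck's formula.

```python
def S(x, t, y=0.0, m=1.0):
    """Free-particle action S = m (x - y)^2 / (2 t) in one dimension."""
    return m * (x - y) ** 2 / (2.0 * t)

def hamilton_jacobi_residual(x, t, y=0.0, m=1.0, h=1e-4):
    """dS/dt + (dS/dx)^2 / (2m), estimated with central differences."""
    dS_dt = (S(x, t + h, y, m) - S(x, t - h, y, m)) / (2 * h)
    dS_dx = (S(x + h, t, y, m) - S(x - h, t, y, m)) / (2 * h)
    return dS_dt + dS_dx ** 2 / (2 * m)

def van_vleck_density(x, t, y=0.0, m=1.0, h=1e-3):
    """a^2 = d^2 S / dx dy, estimated with a mixed central difference."""
    return (S(x + h, t, y + h, m) - S(x + h, t, y - h, m)
            - S(x - h, t, y + h, m) + S(x - h, t, y - h, m)) / (4 * h * h)

def continuity_residual(x, t, y=0.0, m=1.0, h=1e-3):
    """d(a^2)/dt + d/dx (a^2 dS/dx / m), again by central differences."""
    def flux(xx):
        dS_dx = (S(xx + h, t, y, m) - S(xx - h, t, y, m)) / (2 * h)
        return van_vleck_density(xx, t, y, m, h) * dS_dx / m
    d_rho_dt = (van_vleck_density(x, t + h, y, m, h)
                - van_vleck_density(x, t - h, y, m, h)) / (2 * h)
    return d_rho_dt + (flux(x + h) - flux(x - h)) / (2 * h)

# Both residuals should be close to zero at any point with t > 0.
print(hamilton_jacobi_residual(1.3, 2.0, 0.4))
print(continuity_residual(1.3, 2.0, 0.4))
```

In one dimension Van Vleck's formula gives $a^2 = -m/t$; the overall sign is harmless because the continuity equation is linear in $a^2$.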


We can now adapt some of the other results of classical Hamiltonian 
mechanics to our present needs. 
Proposition 15.2.2. The Hamilton-Jacobi equation has solutions
of the form

$$S(\mathbf{x},t) = W(\mathbf{x}) - Et$$

provided that

$$\frac{|\nabla W|^2}{2m} + V = E.$$

The corresponding semi-classical wave function

$$\psi = a\exp[i(W - Et)/\hbar]$$

is a solution of Schrödinger's time-independent equation with energy
$E$.


Proof. This follows immediately on substituting $S = W - Et$ into the
previous equations, and then calculating $i\hbar\,\partial\psi/\partial t$. □


Corollary 15.2.3. In one dimension the Hamilton-Jacobi equation
has solutions of the form

$$S(x,t) = \pm\int\sqrt{2m[E - V(x)]}\,dx - Et,$$

and the continuity equation has solutions of the form

$$a(x) = A[E - V(x)]^{-1/4},$$

where $A$ is a constant.


Proof. In one dimension the equation for $W$ becomes

$$\frac{1}{2m}\left(\frac{dW}{dx}\right)^2 + V(x) = E, \tag{15.16}$$

which can be rearranged and integrated to give the stated formula for $S$.

The amplitude can be found directly from the continuity equation, which
in one dimension becomes

$$\frac{d}{dx}\left(\frac{a^2}{m}\frac{dW}{dx}\right) = 0. \tag{15.17}$$

Clearly the solution of this gives

$$a^2 = B\left(\frac{dW}{dx}\right)^{-1} = \frac{\pm B}{\sqrt{2m[E - V(x)]}} \tag{15.18}$$

for some constant $B$. The result now follows on taking square roots and
renaming constants. (One can also deduce the formula for $a$ from Van
Vleck's solution, as in the following example.) □


Example 15.2.1. (The free particle in one dimension) For the free
particle $V = 0$ so one has simply

$$S = \pm\sqrt{2mE}\,x - Et. \tag{15.19}$$

This is very different in form from the solution $m(x - y)^2/2t$ we might
expect from the three-dimensional case. It is easy to check that both are
solutions of the Hamilton-Jacobi equation, whose solutions can take many
different forms. This time $E$ plays the role of the constant of integration,
so that

$$a^2 = \frac{\partial^2 S}{\partial x\,\partial E} = \pm\sqrt{m/2E}, \tag{15.20}$$

and $\psi$ is just a constant multiple of the plane wave

$$\exp\left[i\left(\pm\sqrt{2mE}\,x - Et\right)/\hbar\right]. \tag{15.21}$$


Remark 15.2.1. We expect the semi-classical approximation to be useful
provided that the term that we have dropped, $\hbar^2\nabla^2 a/2ma$, is small
with respect to the terms that we retained. The form of $a$ in one dimension
suggests that this will be reasonable provided that $E - V$ is not too small,
that is we stay away from the classical turning points.
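The size of the dropped term can be estimated numerically. The sketch below assumes the one-dimensional amplitude $a = [E - V(x)]^{-1/4}$ of Corollary 15.2.3 with $A = 1$, works in units where $m = \hbar = 1$ by default, and uses an illustrative linear potential; the function names are hypothetical. It returns the ratio of $|\hbar^2 a''/2ma|$ to the retained quantity $E - V$.

```python
def amplitude(V, E, x):
    """One-dimensional semi-classical amplitude a = [E - V(x)]^(-1/4), with A = 1."""
    return (E - V(x)) ** -0.25

def dropped_term_ratio(V, E, x, m=1.0, hbar=1.0, h=1e-4):
    """|hbar^2 a'' / (2 m a)| divided by E - V(x).

    a'' is estimated with a central second difference, so x must lie
    inside the classical region and at least h away from a turning point.
    """
    a0 = amplitude(V, E, x)
    a2 = (amplitude(V, E, x + h) - 2 * a0 + amplitude(V, E, x - h)) / (h * h)
    return abs(hbar ** 2 * a2 / (2 * m * a0)) / (E - V(x))

# Illustrative linear potential V(x) = x with E = 0: the turning point is x = 0
# and the classical region is x < 0.
V = lambda x: x
for x in (-4.0, -2.0, -1.0, -0.5, -0.1):
    print(x, dropped_term_ratio(V, 0.0, x))
```

For a linear potential $V = Fx$ the ratio works out to $5\hbar^2F^2/32m(E - V)^3$, so it grows without bound as the turning point is approached, in line with the remark.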


15.3. The Bohr—-Sommerfeld condition 


Owing to the appearance of the square root in the formula for $S$ in Corollary
15.2.3 there are two independent semi-classical solutions, which we write

$$A_\pm[E - V(x)]^{-1/4}\exp\left(\pm\frac{i}{\hbar}\int_c^x\sqrt{2m[E - V(x)]}\,dx\right), \tag{15.22}$$

where $c$ is some arbitrary basepoint. We shall now show that there can
be other constraints that force us to use a particular linear combination of
these two approximate solutions.

The square root makes good sense as long as one is dealing with the
classical region where $E > V$. In the non-classical region, where
$E < V(x)$, it is natural to make the complex substitution
$E - V(x) = |E - V(x)|\exp(\pm i\pi)$, so that our two semi-classical
solutions can then be written as

$$A_\pm e^{\mp i\pi/4}|E - V(x)|^{-1/4}\exp\left(\pm i e^{\pm i\pi/2}\int_c^x\sqrt{2m|E - V(x)|}\,dx/\hbar\right). \tag{15.23}$$


It is easy to check directly that these are valid approximate solutions. To 
pass from the classical to the non-classical regions one must traverse a 
turning point where V = E, and we may as well choose this as the reference 
point, c. For definiteness let us suppose that it is at the extreme right-hand 
edge of the permitted classical region. 

To the right of the classical region the solution should be given by a
decaying exponential, so the appropriate form of solution is

$$C|E - V(x)|^{-1/4}\exp\left(-\int_c^x\sqrt{2m|E - V(x)|}\,dx/\hbar\right), \tag{15.24}$$

which we can obtain by consistently choosing either the upper or the lower
signs in our non-classical solutions and taking

$$A_\pm = Ce^{\pm i\pi/4}. \tag{15.25}$$


This means that the appropriate combination to use in the classical region
is

$$2C[E - V(x)]^{-1/4}\cos\left(\frac{1}{\hbar}\int_c^x\sqrt{2m[E - V(x)]}\,dx + \frac{\pi}{4}\right) = 2C[E - V(x)]^{-1/4}\cos\left(\frac{W(x) - W(c)}{\hbar} + \frac{\pi}{4}\right). \tag{15.26}$$

For a turning point, $b$, on the left-hand edge of the classical region one
obtains by a similar argument the semi-classical wave function of the form

$$2B[E - V(x)]^{-1/4}\cos\left(\frac{W(x) - W(b)}{\hbar} - \frac{\pi}{4}\right), \tag{15.27}$$

for some constant, $B$. However, when there is a single classical region
bounded by the two points $b$ and $c$, these two expressions must coincide,
and, in particular, we must have

$$\frac{W(x) - W(b)}{\hbar} - \frac{\pi}{4} = \frac{W(x) - W(c)}{\hbar} + \frac{\pi}{4} + n\pi \tag{15.28}$$

for some integer $n$, or, on simplifying,

$$W(c) - W(b) = \left(n + \tfrac{1}{2}\right)\pi\hbar. \tag{15.29}$$


In fact, since the left-hand side is positive that integer must be non- 
negative, and we have the following result: 


The Bohr-Sommerfeld condition 15.3.1. Suppose that the potential
$V$ is less than $E$ just on the interval $(b,c)$, with equality $V = E$
at the endpoints. Then for a consistent semi-classical solution we require
that

$$\int_b^c\sqrt{2m[E - V(s)]}\,ds = \left(n + \tfrac{1}{2}\right)\pi\hbar$$

for some non-negative integer $n$.


The Bohr-Sommerfeld rule is more usually expressed in terms of an
integral round a closed curve in phase space, that is from $b$ to $c$ using the
positive square root and back again using the negative root, which doubles
the integral and gives

$$\oint\sqrt{2m[E - V(s)]}\,ds = \left(n + \tfrac{1}{2}\right)2\pi\hbar. \tag{15.30}$$

This rule was used as an ad hoc way to do quantum calculations before
the development of a consistent quantum theory by Heisenberg and
Schrödinger. Not surprisingly, since it is a patchwork of approximations, it
often leads to incorrect conclusions.
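The Bohr-Sommerfeld condition can be solved numerically for the energy. The following is a minimal sketch under stated assumptions: a single-well potential with minimum value 0 at `x_min`, units with $m = \hbar = 1$ by default, and hypothetical helper names. It locates the turning points by bisection, evaluates $\int_b^c\sqrt{2m[E - V(s)]}\,ds$ with the substitution $s = \text{centre} + \text{half}\cdot\sin\theta$ (which tames the square-root behaviour at the endpoints), and then bisects on $E$.

```python
import math

def turning_points(V, E, x_min=0.0, step=1.0):
    """Locate b < x_min < c with V(b) = V(c) = E for a single-well potential."""
    def root(direction):
        lo, hi = x_min, x_min + direction * step
        while V(hi) < E:                 # expand until we leave the classical region
            hi = x_min + 2.0 * (hi - x_min)
        for _ in range(200):             # bisection on V(x) - E
            mid = 0.5 * (lo + hi)
            if V(mid) < E:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    return root(-1.0), root(+1.0)

def action(V, E, m=1.0, x_min=0.0, n_pts=2000):
    """Midpoint-rule estimate of the integral of sqrt(2m[E - V]) from b to c."""
    b, c = turning_points(V, E, x_min)
    centre, half = 0.5 * (b + c), 0.5 * (c - b)
    dtheta = math.pi / n_pts
    total = 0.0
    for k in range(n_pts):
        theta = -0.5 * math.pi + (k + 0.5) * dtheta
        s = centre + half * math.sin(theta)
        total += math.sqrt(max(2.0 * m * (E - V(s)), 0.0)) * half * math.cos(theta)
    return total * dtheta

def wkb_energy(V, n, m=1.0, hbar=1.0, x_min=0.0, E_lo=1e-9, E_hi=1.0):
    """Energy E at which the action equals (n + 1/2) pi hbar."""
    target = (n + 0.5) * math.pi * hbar
    while action(V, E_hi, m, x_min) < target:   # bracket the root from above
        E_hi *= 2.0
    for _ in range(100):                        # bisection: the action increases with E
        E_mid = 0.5 * (E_lo + E_hi)
        if action(V, E_mid, m, x_min) < target:
            E_lo = E_mid
        else:
            E_hi = E_mid
    return 0.5 * (E_lo + E_hi)

# Illustrative quartic well V(x) = x^4: the first few semi-classical levels.
for n in range(3):
    print(n, wkb_energy(lambda x: x ** 4, n))
```

The bisection on $E$ relies on the action being an increasing function of $E$, which holds for a single well of this kind.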

The above argument hinged on the behaviour of the semi-classical solu- 
tions near a turning point, but we have already noted that the whole basis 
for the approximation breaks down, because the term that we dropped to 
get the Hamilton—Jacobi equation can no longer be safely ignored. Fortu- 
nately, another approximation, which we shall now describe, comes to the 
rescue near the turning points, and can be used to justify our formulae. 
Suppose that c is a turning point. Then, to first order, 


$$V(x) \approx V(c) + (x - c)V'(c) = E + (x - c)V'(c). \tag{15.31}$$

This gives as an approximation to Schrödinger's time-independent
equation:

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + (x - c)V'(c)\psi = 0. \tag{15.32}$$
This Airy equation can be solved by Fourier transforming as in Section
3.5, and we shall assume, for definiteness, that $V'(c)$ is positive. Using the
solution given by equation (3.44) with $eF = V'(c)$ and $E = eFc = cV'(c)$,
we have

$$\tilde\psi(p) = N\exp\left[\frac{i}{\hbar}\left(\frac{p^3}{6mV'(c)} - cp\right)\right]. \tag{15.33}$$

Inverting the transform and introducing a new constant $B$ we obtain

$$\psi(x) = B\int\exp\left[\frac{i}{\hbar}\left(p(x - c) + \frac{p^3}{6mV'(c)}\right)\right]dp. \tag{15.34}$$


We are interested in the behaviour of this integral as one moves away 
from the turning point and attempts to weld it to the semi-classical solu- 
tion. For large x the integrand oscillates very rapidly and, long before the 
advent of quantum mechanics, Lord Kelvin had realized that interference 
effects would mean that most of the contributions would cancel. Only near 
the extrema of the exponential would the oscillations be slower, so that 
the dominant contribution to the integral comes from there. (This tech- 
nique for finding asymptotic forms of integrals is known as the method of 
stationary phase.) 


Now the phase $\Phi(x,p) = p(x - c) + p^3/6mV'(c)$ has its extrema when

$$\frac{p^2}{2m} + V'(c)(x - c) = 0, \tag{15.35}$$

which is the linear approximation to

$$\frac{p^2}{2m} + V(x) = E. \tag{15.36}$$

Let us first look at the classical region where $x < c$, and this equation has
real solutions for $p$. At these points, $p_\pm$, the phase takes the value

$$\Phi(x,p_\pm) = \pm\tfrac{2}{3}\sqrt{2mV'(c)(c - x)^3}. \tag{15.37}$$


By integrating the expression for $E - V$ we see that near the turning point
this is approximately

$$\pm W(x) = \pm\int_c^x\sqrt{2m(E - V)}\,dx, \tag{15.38}$$

so that we have recovered the semi-classical expression for the phase.
The amplitude can also be recovered by changing variables near the
stationary values $p_\pm$ to $p = p_\pm + u$. Dropping higher order terms from
the Taylor series, this gives

$$\Phi(x,p) = \Phi(x,p_\pm) + \tfrac{1}{2}\Phi''(x,p_\pm)u^2, \tag{15.39}$$

where the prime denotes a derivative with respect to $p$. We therefore have
contributions to the integral of

$$B\int\exp\left[\frac{i}{\hbar}\left(\Phi(x,p_\pm) + \tfrac{1}{2}\Phi''(x,p_\pm)u^2\right)\right]du. \tag{15.40}$$

The integrand resembles a normal distribution function whose variance is
$i\hbar/\Phi''$. We would expect this to integrate to give us

$$\sqrt{2\pi i\hbar/\Phi''(x,p_\pm)}\;e^{i\Phi(x,p_\pm)/\hbar}, \tag{15.41}$$

and this may be justified with appropriate contour integrals. Introducing
a new constant $C$, we are left with

$$Ce^{\pm i\pi/4}\,|\Phi''(x,p_\pm)|^{-1/2}\,e^{i\Phi(x,p_\pm)/\hbar}. \tag{15.42}$$

We can directly calculate that

$$|\Phi''(x,p_\pm)|^{-1/2} = \sqrt{mV'(c)/|p_\pm|} = \sqrt[4]{\frac{mV'(c)}{2(c - x)}}, \tag{15.43}$$

which is approximately a multiple of the semi-classical expression for the
amplitude, $a$. We have already noted that $\Phi(x,p_\pm)$ is approximately
given by the same integral as $W$, so this confirms that we can obtain the same
result by a method that is valid near the turning points.

For the non-classical region, $x > c$, we allow $p$ to become complex in the
integral in equation (15.34). The integrand is holomorphic, and by standard
complex variable techniques the integral can be shown to be identical to
that along the hyperbolic contour on which

$$p = -i\sqrt{2mV'(c)(x - c)}\,\left(\omega e^{t} + \bar\omega e^{-t}\right), \tag{15.44}$$

where $\omega = \exp(2\pi i/3)$. A straightforward substitution shows that on
this contour the integral reduces to a multiple of

$$e^{-\frac{2}{3\hbar}\sqrt{2mV'(c)(x - c)^3}}\int e^{-\frac{2}{3\hbar}\sqrt{2mV'(c)(x - c)^3}\,[\cosh(3t) - 1]}\left(\omega e^{t} - \bar\omega e^{-t}\right)dt. \tag{15.45}$$

This shows immediately that one has a decaying exponential of exactly the
form derived earlier. More careful analysis of the integral gives the other
factors of the previous calculation.

In recent years the semi-classical approximation has been given a firm 
foundation in differential geometry. The behaviour near the classical turn- 
ing points is of particular interest, since it is linked with the mathematical 
study of singularities. 
Exercises


15.1° Use the WKB approximation to obtain estimates of the energy levels
of the harmonic oscillator with potential $\tfrac{1}{2}m\omega^2x^2$.


15.2 Calculate the additional potential, $-\hbar^2a''/2ma$, in the case of a
harmonic oscillator eigenstate. Show that this tends to a constant if
the energy is large.


15.3° Derive the equations for the WKB approximation for a stationary
state of a one-dimensional system with energy $E$. Solve the equations
and deduce that the WKB approximation gives an exact solution in
this case if and only if the potential $V(x)$ is given by

$$V(x) = E - (\alpha x + \beta)^{-4},$$

for some constants $\alpha$ and $\beta$. For which potentials does the WKB
approximation give an exact solution of Schrödinger's equation for
all energies $E$?


16 Systems of several particles 


One now ceases to understand the quantum theory at all.
WERNER HEISENBERG, letter to Wolfgang Pauli, 9 October 1923


16.1. Identical particles 


One often wants to study systems, such as the helium atom of Sections 12.3
and 14.2, that contain more than one particle. We describe two-particle
systems by a wave function, $\psi(\mathbf{r}_1,\mathbf{r}_2)$, where
$\mathbf{r}_1$ is the position of the first particle, and $\mathbf{r}_2$ that
of the second. If we interchange the two particles, then the wave function
becomes $\psi(\mathbf{r}_2,\mathbf{r}_1)$. Suppose, however, that the two
particles are absolutely identical, and that there is no way of distinguishing
whether the first is at $\mathbf{r}_1$ and the second at $\mathbf{r}_2$ or vice
versa. (So far as is known, this is the case for electrons and many other
subatomic particles.) The two wave functions $\psi(\mathbf{r}_1,\mathbf{r}_2)$
and $\psi(\mathbf{r}_2,\mathbf{r}_1)$ must then represent the same physical
state, and so must be multiples of one another. In other words, we can write

$$\psi(\mathbf{r}_1,\mathbf{r}_2) = \lambda\psi(\mathbf{r}_2,\mathbf{r}_1), \tag{16.1}$$

for some complex number $\lambda$. Since we have no way of knowing which
particle is where, we must also have
$\psi(\mathbf{r}_2,\mathbf{r}_1) = \lambda\psi(\mathbf{r}_1,\mathbf{r}_2)$,
which means that

$$\psi(\mathbf{r}_1,\mathbf{r}_2) = \lambda^2\psi(\mathbf{r}_1,\mathbf{r}_2), \tag{16.2}$$

and tells us that $\lambda = \pm 1$. Moreover, as we shall now show, all wave
functions must give the same choice of sign for $\lambda$. For, if $\psi_-$
changes sign when its arguments are interchanged and $\psi_+$ does not, then
$\psi_+ + \psi_-$ changes to $\psi_+ - \psi_-$ when its arguments are
transposed. These can only be multiples of each other if one of $\psi_+$ and
$\psi_-$ vanishes, which means that one or other sign cannot occur.

It is now a simple matter to consider systems with an arbitrary number,
$n$, of indistinguishable particles. We cannot tell which of the particles is
at which point, so for any permutation $\pi \in S_n$ any physically acceptable
wave function must satisfy

$$\psi(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \lambda(\pi)\,\psi(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)}) \tag{16.3}$$

for some $\lambda(\pi) \in \mathbb{C}$. If we follow one permutation, $\pi$, by
another, $\sigma$, then we see that

$$\lambda(\sigma\pi) = \lambda(\sigma)\lambda(\pi). \tag{16.4}$$

(In the terminology of Definition 9.1.1, $\lambda$ defines a one-dimensional
representation of the symmetric group $S_n$.) From this we deduce that

$$\lambda(\sigma\pi\sigma^{-1}) = \lambda(\sigma)\lambda(\pi)\lambda(\sigma)^{-1} = \lambda(\pi), \tag{16.5}$$

and so we see that conjugate elements give the same scalar $\lambda(\pi)$. Now,
any transposition is conjugate to the interchange of 1 and 2, since, for
example,

$$(r\,s) = (1\,r)(2\,s)(1\,2)(2\,s)^{-1}(1\,r)^{-1}, \tag{16.6}$$

and we therefore deduce that either every transposition, $\tau$, gives
$\lambda(\tau) = 1$, or they all give $\lambda(\tau) = -1$. As librarians know
only too well, the symmetric group is generated by transpositions, so in the
first case every permutation, $\pi$, has $\lambda(\pi) = 1$, whilst in the
second $\lambda(\pi) = 1$ for even permutations (those which are products of
even numbers of transpositions), but $\lambda(\pi) = -1$ for odd permutations.
This proves the following result:


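Proposition 16.1.1. If the wave function $\psi$ is sent to a multiple of
itself by every permutation $\pi$ in $S_n$, then either

$$\psi(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \psi(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)})$$

for all $\pi$, or

$$\psi(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \epsilon(\pi)\,\psi(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)}),$$

where

$$\epsilon(\pi) = \begin{cases} 1 & \text{for even permutations } \pi \\ -1 & \text{for odd permutations } \pi. \end{cases}$$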


Both these possibilities occur in nature and they divide the known par- 
ticles into two classes. 


Definition 16.1.1. If $\psi(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \psi(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)})$ for
all permutations $\pi$ we call the individual particles bosons, or say that
they satisfy Bose-Einstein statistics. If for odd permutations, $\pi$, the
wave function satisfies $\psi(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = -\psi(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)})$,
then we call the individual particles fermions, or say that they satisfy
Fermi-Dirac statistics.
Example 16.1.1. Of the known particles, electrons, protons, neutrons,
and quarks are fermions, whilst photons and mesons are bosons. Whether
a particle is a fermion or a boson is linked to its intrinsic spin. (Just as
the electron is described by two-component wave functions corresponding
to the value $l = \tfrac{1}{2}$ of the angular momentum, so in general one
requires $(2l + 1)$-component wave functions and talks of the particle having
spin $l$.) It can be shown that particles with integral spin should be bosons, and
those with non-integer spin should be fermions, a result known as the
spin-statistics theorem. This was proved by Jacobus de Wet in 1940, and the
first published proof was given by Pauli in the same year.


16.2. Bosons and fermions 


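The behaviour under permutations places strong restrictions on the sort
of wave functions that can represent boson or fermion states, but it is
nonetheless quite easy to construct wave functions of the appropriate kind,
by averaging.


Proposition 16.2.1. For any wave function, $\psi$, on $\mathbb{R}^{3n}$, the
wave function

$$(Q_\lambda\psi)(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \frac{1}{n!}\sum_{\pi\in S_n}\lambda(\pi)\,\psi(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)})$$

satisfies $(Q_\lambda\psi)(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \lambda(\pi)(Q_\lambda\psi)(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)})$
for all $\pi \in S_n$. $Q_\lambda$ is a projection.


Proof. This is proved by direct calculation. For any permutation
$\sigma \in S_n$ we have

$$(Q_\lambda\psi)(\mathbf{r}_{\sigma(1)},\mathbf{r}_{\sigma(2)},\ldots,\mathbf{r}_{\sigma(n)}) = \frac{1}{n!}\sum_{\pi\in S_n}\lambda(\pi)\,\psi(\mathbf{r}_{\pi\sigma(1)},\mathbf{r}_{\pi\sigma(2)},\ldots,\mathbf{r}_{\pi\sigma(n)}), \tag{16.7}$$

and as $\pi$ runs over all permutations $\pi\sigma$ does too (albeit in a
different order), so we may introduce $\rho = \pi\sigma$ and rewrite the sum on
the right as

$$\frac{1}{n!}\sum_{\rho\in S_n}\lambda(\rho\sigma^{-1})\,\psi(\mathbf{r}_{\rho(1)},\mathbf{r}_{\rho(2)},\ldots,\mathbf{r}_{\rho(n)}) = \lambda(\sigma)^{-1}(Q_\lambda\psi)(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n). \tag{16.8}$$

We also note that if each permutation, $\pi$, of the arguments of a wave
function $\phi$ merely introduces a factor $\lambda(\pi)$, then, since
$\lambda(\pi)^2 = 1$, each term in the sum defining $Q_\lambda\phi$ is
identical, and since there are $n!$ terms we have $Q_\lambda\phi = \phi$.
Applying this to $\phi = Q_\lambda\psi$, we see that
$Q_\lambda^2\psi = Q_\lambda\psi$, showing that $Q_\lambda$ is a projection. □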


By taking $\lambda = 1$, we see that if

$$(Q_1\psi)(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \frac{1}{n!}\sum_{\pi\in S_n}\psi(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)}) \tag{16.9}$$

is not identically zero then it is a bosonic wave function, and, similarly,
taking $\lambda = \epsilon$,

$$(Q_\epsilon\psi)(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \frac{1}{n!}\sum_{\pi\in S_n}\epsilon(\pi)\,\psi(\mathbf{r}_{\pi(1)},\mathbf{r}_{\pi(2)},\ldots,\mathbf{r}_{\pi(n)}) \tag{16.10}$$

is a fermionic wave function, provided it is not the zero function. If $\psi$
is already bosonic, or fermionic, then $Q_1\psi = \psi$ or
$Q_\epsilon\psi = \psi$, respectively. When $n = 2$ the formulae simplify
considerably to give

$$(Q_1\psi)(\mathbf{r}_1,\mathbf{r}_2) = \tfrac{1}{2}\left[\psi(\mathbf{r}_1,\mathbf{r}_2) + \psi(\mathbf{r}_2,\mathbf{r}_1)\right],$$

$$(Q_\epsilon\psi)(\mathbf{r}_1,\mathbf{r}_2) = \tfrac{1}{2}\left[\psi(\mathbf{r}_1,\mathbf{r}_2) - \psi(\mathbf{r}_2,\mathbf{r}_1)\right].$$

In this case we can express the original wave function as the sum of
$Q_1\psi$ and $Q_\epsilon\psi$, but this fails for larger $n$. There have been
various attempts to build theories that include other kinds of particle
statistics, but, although these have other uses, they do not seem to be needed
for the basic particles observed in nature. (In some parts of solid state
physics, where electrons are effectively confined to a surface, it is useful to
allow the possibility that transposition of the particle positions changes the
wave function by a factor, $\lambda$, which is neither 1 nor $-1$. This gives
rise to so-called anyon statistics.)
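The averaging construction is easy to try out on ordinary functions. The sketch below is an illustration only: it treats each $\mathbf{r}_j$ as a single real number, and the names `sign`, `project`, and the test function `psi` are hypothetical. It builds $Q_1\psi$ and $Q_\epsilon\psi$ directly from the sums (16.9) and (16.10).

```python
from itertools import permutations
from math import factorial

def sign(perm):
    """+1 for an even permutation of (0, ..., n-1), -1 for an odd one."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def project(psi, n, fermionic=False):
    """Return Q_1 psi (bosonic) or Q_epsilon psi (fermionic) as a new function."""
    def projected(*r):
        total = 0.0
        for perm in permutations(range(n)):
            weight = sign(perm) if fermionic else 1
            total += weight * psi(*(r[p] for p in perm))
        return total / factorial(n)
    return projected

# An arbitrary, unsymmetrised function of three one-dimensional coordinates.
psi = lambda x, y, z: x + 2.0 * y ** 2 + 3.0 * z ** 3 * x

bose = project(psi, 3)
fermi = project(psi, 3, fermionic=True)

r = (0.3, -1.2, 2.5)
print(bose(*r), bose(r[1], r[0], r[2]))    # equal: symmetric under a swap
print(fermi(*r), fermi(r[1], r[0], r[2]))  # opposite signs: antisymmetric
```

Applying `project` a second time to `bose` or `fermi` returns the same values up to rounding, which is the projection property $Q_\lambda^2 = Q_\lambda$.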
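One interesting case of the above expressions for bosonic and fermionic
wave functions occurs for a separable function
$\psi_1(\mathbf{r}_1)\psi_2(\mathbf{r}_2)$. Its fermionic projection is

$$Q_\epsilon\psi(\mathbf{r}_1,\mathbf{r}_2) = \tfrac{1}{2}\left[\psi_1(\mathbf{r}_1)\psi_2(\mathbf{r}_2) - \psi_1(\mathbf{r}_2)\psi_2(\mathbf{r}_1)\right] \tag{16.11}$$

$$= \frac{1}{2}\begin{vmatrix} \psi_1(\mathbf{r}_1) & \psi_2(\mathbf{r}_1) \\ \psi_1(\mathbf{r}_2) & \psi_2(\mathbf{r}_2) \end{vmatrix}. \tag{16.12}$$

It is easy to generalize this to $n$ particles, to see that if
$\Psi(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \prod_{j=1}^{n}\psi_j(\mathbf{r}_j)$
then

$$Q_\epsilon\Psi(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_n) = \frac{1}{n!}\begin{vmatrix} \psi_1(\mathbf{r}_1) & \psi_2(\mathbf{r}_1) & \cdots & \psi_n(\mathbf{r}_1) \\ \psi_1(\mathbf{r}_2) & \psi_2(\mathbf{r}_2) & \cdots & \psi_n(\mathbf{r}_2) \\ \vdots & \vdots & & \vdots \\ \psi_1(\mathbf{r}_n) & \psi_2(\mathbf{r}_n) & \cdots & \psi_n(\mathbf{r}_n) \end{vmatrix}. \tag{16.13}$$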
This form is sometimes called a Slater determinant. It immediately leads
us to Pauli's exclusion principle:


Theorem 16.2.2. If any two wave functions $\psi_j$ are the same then
$Q_\epsilon(\psi_1\psi_2\ldots\psi_n)$ vanishes.


Proof. A determinant vanishes if two of its columns are the same. □
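The exclusion principle can be seen numerically by evaluating a Slater determinant. The following is a minimal sketch with hypothetical names, using one-dimensional coordinates and ordinary real functions in place of single-particle wave functions; the determinant is expanded as a signed sum over permutations, which is only sensible for small $n$.

```python
import math
from itertools import permutations

def sign(perm):
    """Parity of a permutation given as a tuple of indices: +1 if even, -1 if odd."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def slater(orbitals, positions):
    """(1/n!) det[ psi_k(r_j) ], expanded as a signed sum over permutations."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = sign(perm)
        for k in range(n):
            term *= orbitals[k](positions[perm[k]])
        total += term
    return total / math.factorial(n)

# Illustrative stand-ins for three different single-particle wave functions.
orbitals = [math.sin, math.cos, math.exp]
r = [0.1, 0.7, 1.9]

value = slater(orbitals, r)
swapped = slater(orbitals, [0.7, 0.1, 1.9])                 # exchange two particles
repeated = slater([math.sin, math.cos, math.sin], r)        # two equal orbitals
print(value, swapped, repeated)
```

Exchanging two positions reverses the sign, and repeating an orbital gives zero up to rounding, as in Theorem 16.2.2.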


Pauli’s exclusion principle may be paraphrased to say that two fermions 
cannot simultaneously occupy the same state. (The reader may be won- 
dering why we did not mention this when we discussed the helium atom, 
and be uneasy that we used a zeroth-order wave function in which both 
electrons were in the same ground state. This is permissible because we 
neglected the electron spin. Had we included it we should have had one 
electron in one spin state and the other in an orthogonal state, thereby 
avoiding a clash with the Pauli principle.) In fact it is possible to extract a 
more powerful result by the same procedure. Suppose that the individual
wave functions, $\psi_j$, have to be selected from an $N$-dimensional space,
such as the space of wave functions corresponding to a particular, degenerate,
energy level. The properties of a determinant mean that it clearly vanishes
if any of its columns are linear combinations of the others.


Theorem 16.2.3. Suppose that $n$ fermions can each be described
by wave functions in an $N$-dimensional space. Then the number of
independent fermion states of the $n$ particles is ${}^{N}C_{n}$.


Proof. Let $\{\psi_1, \psi_2, \ldots, \psi_N\}$ be an orthonormal basis for the
space of single fermions. Any state of $n$ fermions can be constructed as a
combination of Slater determinants of size $n$. The order of the columns is
unimportant since it can only affect the sign, and there are ${}^{N}C_{n}$ ways
of choosing a set of $n$ columns from the $N$ basis vectors. □

We may similarly count states of n bosons. 
Theorem 16.2.4. Suppose that $n$ bosons can each be described
by wave functions in an $N$-dimensional space. Then the number of
independent boson states of the $n$ particles is ${}^{N+n-1}C_{n}$.


Proof. In this case the bosonic part of the product $\psi_1\ldots\psi_n$ is

$$\frac{1}{n!}\sum_{\pi\in S_n}\psi_1(\mathbf{r}_{\pi(1)})\psi_2(\mathbf{r}_{\pi(2)})\cdots\psi_n(\mathbf{r}_{\pi(n)}). \tag{16.14}$$

There are again $N$ independent choices for each $\psi_j$, but this time
repetitions are permitted, so, typically, $\psi_j$ might occur $k_j$ times. The
number of independent choices is therefore the same as the number of ways of
taking products of monomials in $N$ variables, $x_1, x_2, \ldots, x_N$, that is
expressions of the form

$$x_1^{k_1}x_2^{k_2}\cdots x_N^{k_N} \tag{16.15}$$

whose total degree $k_1 + k_2 + \ldots + k_N$ is $n$. It is easy to see that
these terms arise as the coefficient of the $s^n$ term in

$$\sum_{k_1,\ldots,k_N}(sx_1)^{k_1}(sx_2)^{k_2}\cdots(sx_N)^{k_N} = \prod_j\sum_k(sx_j)^k = \prod_j(1 - sx_j)^{-1}. \tag{16.16}$$

On setting $x_j = 1$ for all $j$, the right-hand side becomes just
$(1 - s)^{-N}$, whilst the coefficient of $s^n$ on the left-hand side counts
the number of possible terms of total degree $n$. The binomial theorem tells us
that

$$(1 - s)^{-N} = \sum_{n=0}^{\infty}{}^{N+n-1}C_{n}\,s^n, \tag{16.17}$$

from which the result follows. □
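Both counting results can be confirmed by brute-force enumeration for small cases. In this sketch (function names are illustrative) a fermion state is labelled by a set of $n$ distinct basis indices and a boson state by a multiset of $n$ indices, and the counts are compared with the binomial coefficients of Theorems 16.2.3 and 16.2.4.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

def fermion_states(N, n):
    """Choices of n distinct single-particle basis states out of N."""
    return list(combinations(range(N), n))

def boson_states(N, n):
    """Choices of n single-particle basis states out of N, repetitions allowed."""
    return list(combinations_with_replacement(range(N), n))

N, n = 5, 3
print(len(fermion_states(N, n)), comb(N, n))          # both counts agree
print(len(boson_states(N, n)), comb(N + n - 1, n))    # both counts agree
```

When $n > N$ the list of fermion states is empty, which is the exclusion principle again.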


16.3. The periodic table 


Despite its simplicity Pauli’s exclusion principle is of crucial importance 
in the physical world, for it prevents the fermions of which most ordinary 
matter is composed from accumulating together in the lowest energy state. 
Without it solid matter could not exist in stable forms. Moreover, it de- 
termines the nature of the elements of which ordinary matter is composed. 

Let us first consider a hydrogen-like atom, but taking into account
electron spin and the exclusion principle, both of which were neglected
previously. As we observed in Section 8.8 the electron has spin
$\tfrac{1}{2}$, and its wave function should really be $\mathbb{C}^2$ valued.
(We may think of the two basis vectors as representing states with spin up and
spin down.) There is now a two-dimensional space of possible ground states for
the electron, spanned, in the notation of Proposition 4.3.1, by

$$\psi_{100}(\mathbf{r})\begin{pmatrix}1\\0\end{pmatrix} \quad\text{and}\quad \psi_{100}(\mathbf{r})\begin{pmatrix}0\\1\end{pmatrix}. \tag{16.18}$$


Similarly there are now $2n^2$ states with energy $-Ze^2/8\pi\epsilon_0an^2$,
since each of the $n^2$ independent wave functions can be combined with two
possible spin states. As we add more electrons they tend naturally to fill the
lowest energy states first. However, Pauli's exclusion principle stops them
from all occupying the ground state, as there is room for only two. After the
first two electrons, the first excited state will be filled, where there is
room for $2 \times 2^2 = 8$ electrons, and so on. We thus see the periodic
table building up,
with hydrogen and helium in the first row, followed by the eight elements 
lithium, beryllium, boron, carbon, nitrogen, oxygen, fluorine, and neon in 
the second, and so on. Of course, we have oversimplified by ignoring the 
interactions between electrons and by assuming that the energies of the 
states are independent of the electron spin, which is not quite true. With 
such refinements it is possible to understand the whole of the periodic table 
in terms of the above principles. 
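The filling rule just described can be written out in a few lines. This is a sketch of the oversimplified model only (no interactions between electrons, energies depending on $n$ alone), so it should not be trusted beyond the first two rows of the table; the function names are illustrative.

```python
def shell_capacity(n):
    """Number of states in the n-th hydrogen-like level: n^2 orbitals times 2 spin states."""
    return 2 * n * n

def fill_shells(electrons):
    """Occupation of successive levels when electrons fill the lowest levels first."""
    occupation, n = [], 1
    while electrons > 0:
        placed = min(electrons, shell_capacity(n))
        occupation.append(placed)
        electrons -= placed
        n += 1
    return occupation

for name, z in [("hydrogen", 1), ("helium", 2), ("lithium", 3),
                ("oxygen", 8), ("neon", 10)]:
    print(name, fill_shells(z))
```

For eight electrons the model places two in the lowest level and six in the next, leaving two of the eight places in that level unfilled.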

Moreover, we can also understand the notion of valence. For example, 
oxygen has room for two of its eight electrons in the ground state whilst 
the remainder can occupy six of the eight available places amongst the first 
excited states. This leaves two more places amongst the first excited states 
for electrons poached from other atoms, and is responsible for oxygen’s 
valency of 2. Of course, the above discussion is rather vague and does not, 
for example, immediately explain why two oxygen atoms can bind together 
to form an oxygen molecule. Nonetheless, it can be refined with a more
precise mathematical analysis based upon detailed Schrödinger equations
coupled with the requirement that the $n$-electron state vector must be
antisymmetric. Although in practice the resulting equations can only be
solved numerically, the principles involved seem to be correct, so that one
might almost say that chemistry is an exploration of the consequences of
Pauli's exclusion principle.

The periodic table is not the only important manifestation of Pauli's
exclusion principle. One much more extreme example is provided by neutron
stars. In younger and middle-aged stars like our sun, fusion processes
convert hydrogen nuclei into those of heavier elements, and the radiation
pressure from these thermonuclear reactions is sufficient to stop the star
from collapsing under its own gravitational attraction. Eventually, however,
when all the possible thermonuclear fuel is exhausted and after a
complicated series of transformations, the gravitational forces start to win.
The star's atoms, ionized aeons earlier in its thermonuclear furnace, are 
now subjected to conditions so extreme that their nuclei and electrons in- 
teract to form neutrons (and other particles that are radiated away). These 
are compressed by the gravitational forces into a neutron star. However, 
unless the mass is so great or confined within so small a radius as to form 
a black hole, there is then a respite, because the neutrons, being fermions, 
cannot all occupy the state of least gravitational energy, to which they 
would otherwise be inexorably pulled. 


16.4. Bose—Einstein condensation 


By contrast with fermions, bosons are gregarious and occupy the same 
state more often than one might have expected. For example, photons are 
bosons, and lasers create large numbers of them in the same state. At 
sufficiently low temperatures large numbers of bosons will all fill the lowest 
energy state, in a process known as Bose-Einstein condensation. In 1995 a
team of physicists in Boulder, Colorado, succeeded in cooling around 2000
rubidium atoms to a temperature of $2 \times 10^{-7}$ degrees above absolute
zero, so that they all condensed into a single clump of around $10^{-4}$ metres
diameter. Even earlier, however, the effects of Bose-Einstein condensation had
made themselves apparent in an indirect way in the anomalous behaviour of
liquid helium. Helium is found in two isotopes, $^3$He and $^4$He. The commoner
isotope $^4$He, of atomic weight 4, has a nucleus made of two protons and
two neutrons. Since all these constituents have spin $\tfrac{1}{2}$ the nucleus
has integral spin and is a boson. At temperatures below 2 K (well below its
boiling point of 4 K) it becomes a superfluid, able to climb up the walls
of any container, and having virtually no viscosity. The nucleus of the
rarer $^3$He contains only one neutron, and so it is a fermion. It exhibits
anomalous behaviour only below about 0.002 K, when the nuclei pair up to
form bosons.

In 1911 Kamerlingh Onnes discovered that some materials appear to 
lose all resistance to the flow of electric currents at low enough tempera- 
tures, a phenomenon known as superconductivity. Leon Cooper suggested 
in 1956 that this was the result of a mechanism, similar to that in $^3$He,
whereby the electrons carrying the currents pair up. These Cooper pairs,
being bosons, can show Bose-Einstein condensation, which enables them
to carry the current freely. It was long believed that the electron pairing
was caused by interactions with the crystal lattice of the superconductor.
This was the BCS theory, elaborated by Bardeen, Cooper, and Schrieffer,
for which they received the 1972 Nobel Prize in Physics. However, this
theory suggested that no material could be superconducting above about
25 K. The discovery of high temperature superconductors in 1986, whose
structure was far more complicated than the previously known materials, 
showed that this could not be the whole story. It is believed, however, 
that the idea of some kind of pairing of charge carriers (not necessarily the 
electrons) is still valid. 

In 1962 Brian Josephson, then a student at Cambridge, realized that 
Cooper pairs would be able to tunnel through a sufficiently thin insulator
placed in a superconducting circuit. These 'Josephson junctions' form
the basis of the superconducting quantum interference device, or SQUID, an
extremely sensitive tool for measuring magnetic fields. SQUIDs exhibit
quantum effects on an everyday scale, and have been proposed for some 
extremely sensitive tests of quantum mechanical effects. 


16.5*. Tensor products 


So far our description of many-particle systems has been limited to wave
functions, but often a more algebraic approach is needed. In the general
case where the two quantum states are described by vectors in spaces $\mathcal{H}_1$
and $\mathcal{H}_2$ we want to construct an inner product space $\mathcal{H}$, whose vectors
represent states of the combined system. The most obvious approach is to
choose orthonormal bases $\{\xi_j\}_{j\in J}$ and $\{\eta_k\}_{k\in K}$ for $\mathcal{H}_1$ and $\mathcal{H}_2$, and then
pick any inner product space $\mathcal{H}$ with an orthonormal basis of the form
$\{\zeta_{jk} : (j,k)\in J\times K\}$. This is consistent with the approach for wave functions,
since in that case we could simply take for $\zeta_{jk}(\mathbf{r}_1,\mathbf{r}_2)$ the separable wave
function $\xi_j(\mathbf{r}_1)\eta_k(\mathbf{r}_2)$. Although not every wave function of two variables
is separable, we expect that they can all be expanded as an (infinite) sum
of separable solutions, so we would expect $\mathcal{H}$ to include all two-particle
wave functions. In order to distinguish the kind of product appearing in
separable solutions, $\zeta_{jk}(\mathbf{r}_1,\mathbf{r}_2)=\xi_j(\mathbf{r}_1)\eta_k(\mathbf{r}_2)$, from the ordinary pointwise
product where we multiply the values of $\xi_j$ and $\eta_k$ at the same point, we
shall henceforth write it as $\zeta_{jk}=\xi_j\otimes\eta_k$. (The symbol $\otimes$ is read as tensor.)

Unfortunately our construction of $\mathcal{H}$ is quite arbitrary. Had we chosen
different bases for $\mathcal{H}_1$ and $\mathcal{H}_2$ it is far from obvious that we should
have ended up with the same space $\mathcal{H}$ for the combined system. It is
therefore more useful to try to characterize the space intrinsically without
using bases. Intuitively we expect $\mathcal{H}$ to incorporate both the single-particle
spaces, and to be the smallest space with this property (that is, any other
space that contains both the single-particle spaces should also include $\mathcal{H}$).
These two requirements correspond to the two conditions of the following
definition. (We recall that a bilinear map from $\mathcal{H}_1\times\mathcal{H}_2$ to $\mathcal{K}$ is a map that
is linear in each of its two arguments.)


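Definition 16.5.1. Let $\mathcal{H}_1$ and $\mathcal{H}_2$ be inner product spaces. Suppose
that there exists a space $\mathcal{H}_1\otimes\mathcal{H}_2$ with the following properties:

(i) there is a bilinear map from $\mathcal{H}_1\times\mathcal{H}_2$ to $\mathcal{H}_1\otimes\mathcal{H}_2$ written as
$(\psi_1,\psi_2)\mapsto\psi_1\otimes\psi_2$;

(ii) for any bilinear map, $\beta$, from $\mathcal{H}_1\times\mathcal{H}_2$ to an inner product space
$\mathcal{K}$, there is also a unique linear map $\hat\beta$ from $\mathcal{H}_1\otimes\mathcal{H}_2$ to $\mathcal{K}$ satisfying

$$\hat\beta(\psi_1\otimes\psi_2)=\beta(\psi_1,\psi_2).$$

Then $\mathcal{H}_1\otimes\mathcal{H}_2$ is called the tensor product of $\mathcal{H}_1$ and $\mathcal{H}_2$, and its
elements are called tensors. The inner product on $\mathcal{H}_1\otimes\mathcal{H}_2$ is defined
by

$$\langle\phi_1\otimes\phi_2|\psi_1\otimes\psi_2\rangle=\langle\phi_1|\psi_1\rangle\langle\phi_2|\psi_2\rangle.$$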


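It is fairly clear that the space $\mathcal{H}$ constructed in terms of bases has these
properties. Any vectors in $\mathcal{H}_1$ and $\mathcal{H}_2$ can be expanded as $\psi_1=\sum_j x_j\xi_j$
and $\psi_2=\sum_k y_k\eta_k$, and we have the bilinear map

$$(\psi_1,\psi_2)\mapsto\sum_{j,k}x_jy_k\zeta_{jk}. \tag{16.19}$$

This means that $\zeta_{jk}=\xi_j\otimes\eta_k$. Given $\beta\colon\mathcal{H}_1\times\mathcal{H}_2\to\mathcal{K}$ we need a linear
transformation $\hat\beta\colon\mathcal{H}\to\mathcal{K}$ which must in particular satisfy $\hat\beta(\xi_j\otimes\eta_k)=\beta(\xi_j,\eta_k)$.
This forces us to take

$$\hat\beta\Big(\sum_{j,k}z_{jk}\zeta_{jk}\Big)=\sum_{j,k}z_{jk}\beta(\xi_j,\eta_k), \tag{16.20}$$

so that $\hat\beta$ is uniquely defined on a general element of $\mathcal{H}$. Moreover, it is
easy to check that it satisfies all the requirements of the definition. This
shows that our definition is not vacuous, but we must now check that there
is only one possible space $\mathcal{H}_1\otimes\mathcal{H}_2$.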
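
For finite-dimensional spaces the basis construction can be tried out directly. The following is a minimal sketch, not part of the formal development: vectors are represented by their coefficient lists, the helper names `tensor` and `inner` are our own, the inner product is assumed to be conjugate-linear in its first argument, and the sample coefficients are arbitrary. It checks that the coefficients $x_jy_k$ give a map that is linear in the first argument and that the inner product of decomposable tensors factorizes as in the definition.

```python
from itertools import product

def tensor(x, y):
    """Coefficients x_j * y_k of psi1 (x) psi2 in the basis zeta_jk, ordered by (j, k)."""
    return [xj * yk for xj, yk in product(x, y)]

def inner(phi, psi):
    """Inner product, assumed here to be conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

# Arbitrary illustrative coefficient vectors (dim H1 = 2, dim H2 = 3).
phi1, phi2 = [1 + 2j, 0.5j], [2.0, -1j, 1.0]
psi1, psi2 = [1j, 3.0], [1.0, 2.0, -1 + 1j]

dim = len(tensor(psi1, psi2))  # one coefficient per pair (j, k)

# Linearity in the first argument: (lam*phi1 + psi1) (x) psi2.
lam = 2 - 1j
lhs = tensor([lam * a + b for a, b in zip(phi1, psi1)], psi2)
rhs = [lam * u + v for u, v in zip(tensor(phi1, psi2), tensor(psi1, psi2))]
bilinear_ok = all(abs(u - v) < 1e-12 for u, v in zip(lhs, rhs))

# The inner product of decomposable tensors factorizes.
ip_tensor = inner(tensor(phi1, phi2), tensor(psi1, psi2))
ip_factors = inner(phi1, psi1) * inner(phi2, psi2)
print(dim, bilinear_ok, abs(ip_tensor - ip_factors) < 1e-12)
```

The ordering of the pairs $(j,k)$ is a choice of convention; any other ordering gives an isomorphic space.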


Proposition 16.5.1. Any two tensor product spaces satisfying the 
requirements of the definition are isomorphic. 


Proof. Suppose that besides $\mathcal{H}_1\otimes\mathcal{H}_2$ the space $\mathcal{H}_1\odot\mathcal{H}_2$ also satisfies
the provisions of the definition. By the first of the two conditions there is a
bilinear map from $\mathcal{H}_1\times\mathcal{H}_2$ to $\mathcal{H}_1\odot\mathcal{H}_2$, which we denote by $\odot\colon(\psi_1,\psi_2)\mapsto\psi_1\odot\psi_2$.
But then, by the second part of the definition of $\mathcal{H}_1\otimes\mathcal{H}_2$, there
exists a linear map $\hat\odot$ from $\mathcal{H}_1\otimes\mathcal{H}_2$ to $\mathcal{H}_1\odot\mathcal{H}_2$. Reversing the roles of the
two contenders for the tensor product we also have a linear map $\hat\otimes$ from
$\mathcal{H}_1\odot\mathcal{H}_2$ to $\mathcal{H}_1\otimes\mathcal{H}_2$. Since each is unique, it is easy to see that $\hat\odot$ and
$\hat\otimes$ must be mutually inverse maps, and so set up the required isomorphism
between the two spaces. □


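The construction of $\mathcal{H}=\mathcal{H}_1\otimes\mathcal{H}_2$ in terms of bases immediately gives
us its dimension.


Proposition 16.5.2. If $\mathcal{H}_1$ and $\mathcal{H}_2$ are finite dimensional then the
dimension of the tensor product is

$$\dim(\mathcal{H}_1\otimes\mathcal{H}_2)=\dim\mathcal{H}_1\,\dim\mathcal{H}_2.$$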


The above proofs show that the real purpose of the second condition in
the definition is to enable us to extend to the whole of the tensor
product space maps which are initially defined only on elements of the form
$\psi_1\otimes\psi_2$, and this is why we only needed to define the inner product for
tensors of the form $\psi_1\otimes\psi_2$.


Definition 16.5.2. Vectors that can be written in the form $\psi_1\otimes\psi_2$
are called decomposable tensors.


We can also easily prove some other useful properties.


Proposition 16.5.3. For any spaces $\mathcal{H}_1$, $\mathcal{H}_2$ and $\mathcal{H}$, with $\mathcal{H}^*$ the dual
of $\mathcal{H}$, and $\mathcal{L}(\mathcal{H}_1,\mathcal{H}_2)$ the linear transformations from $\mathcal{H}_1$ to $\mathcal{H}_2$, we
have

(i) $\mathbb{C}\otimes\mathcal{H}\cong\mathcal{H}\cong\mathcal{H}\otimes\mathbb{C}$;

(ii) $\mathcal{H}_1\otimes(\mathcal{H}_2\otimes\mathcal{H})\cong(\mathcal{H}_1\otimes\mathcal{H}_2)\otimes\mathcal{H}$;

(iii) $\mathcal{H}\otimes(\mathcal{H}_1\oplus\mathcal{H}_2)\cong(\mathcal{H}\otimes\mathcal{H}_1)\oplus(\mathcal{H}\otimes\mathcal{H}_2)$;

(iv) $\mathcal{H}_2\otimes\mathcal{H}_1^*\cong\mathcal{L}(\mathcal{H}_1,\mathcal{H}_2)$.


Proof. (i) Now that we know that the tensor product space is essentially
unique we need only check that $\mathcal{H}$ satisfies the properties required of $\mathbb{C}\otimes\mathcal{H}$
to know that they are isomorphic. Now the scalar product that takes $(\lambda,\psi)$
to $\lambda\psi$ is a bilinear map from $\mathbb{C}\times\mathcal{H}$ to $\mathcal{H}$. Given any bilinear map $\beta$ from
$\mathbb{C}\times\mathcal{H}$ to $\mathcal{K}$ we may define a linear map $\hat\beta$ from $\mathcal{H}$ to $\mathcal{K}$ by

$$\hat\beta(\psi)=\beta(1,\psi). \tag{16.21}$$

Since

$$\hat\beta(\lambda\psi)=\beta(1,\lambda\psi)=\lambda\beta(1,\psi)=\beta(\lambda,\psi), \tag{16.22}$$

this map satisfies the compatibility condition, and it is easily seen to be
unique, so the result follows. The proofs of the other parts follow the same
lines as these. The isomorphism in the last part sends $\xi\otimes\eta\in\mathcal{H}_2\otimes\mathcal{H}_1^*$ to
the linear transformation $\psi\in\mathcal{H}_1\mapsto\eta(\psi)\xi$. □


As already mentioned decomposable tensors are the algebraic analogue
of separable solutions of a differential equation. Just as most solutions
are not separable, most tensors are not themselves decomposable, but are
linear combinations of decomposable tensors and this fact leads to the
phenomenon known as entanglement, whereby the properties of one particle
depend to some extent on those of another. We saw an example of this
in the case of the two Einstein–Podolsky–Rosen photons in Section 10.4.
Entanglement posed one of the major obstacles to the calculation of the
energy levels of helium and more complex atoms before the development
of quantum theory. Although perturbation and variational methods start
with separable wave functions, it is easy to check that the true wave
functions do not decompose in that way.
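
In finite dimensions there is a simple test for decomposability, which we sketch here as an illustration rather than as part of the theory. A tensor $\sum z_{jk}\,\xi_j\otimes\eta_k$ is decomposable exactly when the coefficient matrix $(z_{jk})$ has rank at most one, that is, when all its $2\times2$ minors vanish. The function name `is_decomposable`, the tolerance, and the sample states are our own choices.

```python
def is_decomposable(z, tol=1e-12):
    """True if the tensor sum_jk z[j][k] xi_j (x) eta_k has the form psi1 (x) psi2.

    This holds exactly when every 2x2 minor of the coefficient matrix vanishes
    (rank at most one); `tol` absorbs floating-point rounding.
    """
    rows, cols = len(z), len(z[0])
    return all(
        abs(z[j][k] * z[jj][kk] - z[j][kk] * z[jj][k]) <= tol
        for j in range(rows) for jj in range(j + 1, rows)
        for k in range(cols) for kk in range(k + 1, cols)
    )

# A product state: z_jk = x_j * y_k.
x, y = [1.0, 2j], [3.0, -1.0]
product_state = [[xj * yk for yk in y] for xj in x]

# An (unnormalized) antisymmetric combination xi_0 (x) eta_1 - xi_1 (x) eta_0.
singlet = [[0.0, 1.0], [-1.0, 0.0]]

print(is_decomposable(product_state), is_decomposable(singlet))
```

The second state cannot be written as a single product $\psi_1\otimes\psi_2$, which is the algebraic content of entanglement.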

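We can, of course, extend the notion of tensor product to more than just
two particles.


Definition 16.5.3. Let $\mathcal{H}_1,\mathcal{H}_2,\ldots,\mathcal{H}_n$ and $\mathcal{K}$ be inner product
spaces. A map from $\mathcal{H}_1\times\mathcal{H}_2\times\cdots\times\mathcal{H}_n$ to $\mathcal{K}$ which is linear in each
argument is said to be n-linear.


Definition 16.5.4. In the same notation as the previous definition
suppose that there is a space, $\mathcal{H}_1\otimes\mathcal{H}_2\otimes\cdots\otimes\mathcal{H}_n$, that satisfies the
two conditions that

(i) there is an n-linear map

$$\mathcal{H}_1\times\mathcal{H}_2\times\cdots\times\mathcal{H}_n\to\mathcal{H}_1\otimes\mathcal{H}_2\otimes\cdots\otimes\mathcal{H}_n$$

which takes $(\psi_1,\psi_2,\ldots,\psi_n)$ to the vector $\psi_1\otimes\psi_2\otimes\cdots\otimes\psi_n$;

(ii) for any n-linear map $\beta$ from $\mathcal{H}_1\times\mathcal{H}_2\times\cdots\times\mathcal{H}_n$ to $\mathcal{K}$ there is a
linear map $\hat\beta$ from $\mathcal{H}_1\otimes\mathcal{H}_2\otimes\cdots\otimes\mathcal{H}_n$ to $\mathcal{K}$ such that

$$\hat\beta(\psi_1\otimes\cdots\otimes\psi_n)=\beta(\psi_1,\psi_2,\ldots,\psi_n);$$

then $\mathcal{H}_1\otimes\mathcal{H}_2\otimes\cdots\otimes\mathcal{H}_n$ is said to be the tensor product of the spaces
$\mathcal{H}_1,\mathcal{H}_2,\ldots,\mathcal{H}_n$.


The inner products are given by the obvious product

$$\langle\phi_1\otimes\cdots\otimes\phi_n|\psi_1\otimes\cdots\otimes\psi_n\rangle=\prod_{j}\langle\phi_j|\psi_j\rangle.$$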


We also need to consider observables in quantum mechanics, which means
looking at operators on tensor products. If $A_j$ is a linear operator on $\mathcal{H}_j$,
for $j=1,\ldots,n$, then there is an obvious operator $A_1\otimes A_2\otimes\cdots\otimes A_n$ on
$\mathcal{H}_1\otimes\mathcal{H}_2\otimes\cdots\otimes\mathcal{H}_n$, defined by

$$(A_1\otimes A_2\otimes\cdots\otimes A_n)(\psi_1\otimes\psi_2\otimes\cdots\otimes\psi_n)=A_1\psi_1\otimes A_2\psi_2\otimes\cdots\otimes A_n\psi_n. \tag{16.23}$$

It is common to write $A(k)$ for this product when $A_k=A$ and $A_j=1$ for
$j\neq k$. (Thus, for example, one writes $A(1)=A\otimes1$ and $A(2)=1\otimes A$.)
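
As a concrete check of (16.23) for two factors, here is a small sketch with matrices acting on coefficient vectors. The helper names (`kron`, `matmul`, `apply`, `tensor`) and the sample matrices are our own illustrative choices; the basis $\xi_j\otimes\eta_k$ is assumed to be ordered by $(j,k)$. The sketch also confirms that $A(1)=A\otimes1$ and $A(2)=1\otimes A$ commute, with product $A\otimes A$.

```python
def kron(A, B):
    """Matrix of A (x) B in the basis xi_j (x) eta_k ordered by (j, k)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def apply(M, v):
    """Matrix acting on a coefficient vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def tensor(x, y):
    """Coefficients of psi1 (x) psi2, ordered by (j, k)."""
    return [xj * yk for xj in x for yk in y]

I2 = [[1, 0], [0, 1]]
A = [[0, 1], [1, 0]]    # illustrative operator on H1
B = [[1, 0], [0, -1]]   # illustrative operator on H2
psi1, psi2 = [1, 2], [3, -1]

# (A (x) B)(psi1 (x) psi2) compared with (A psi1) (x) (B psi2).
lhs = apply(kron(A, B), tensor(psi1, psi2))
rhs = tensor(apply(A, psi1), apply(B, psi2))

# A(1) = A (x) 1 and A(2) = 1 (x) A act on different factors, so they commute.
A1, A2 = kron(A, I2), kron(I2, A)
commute = matmul(A1, A2) == matmul(A2, A1)
print(lhs, rhs, commute)
```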


16.6*. Symmetric and antisymmetric tensors 


Tensor products provide an algebraic description of systems of several par- 
ticles, whether they may be distinguished or not. We shall now consider 
how to describe fermions and bosons algebraically. There are two equiv- 
alent, but slightly different, ways of doing this. The first is to build the 
permutation symmetries into the definition by using only those multilin- 
ear forms that have the correct behaviour under permutations. Since they 
are indistinguishable, each individual particle is described by vectors in the
same inner product space $\mathcal{H}_0$.


Definition 16.6.1. Let $\mathcal{H}_0$ and $\mathcal{K}$ be inner product spaces. An
n-linear map

$$\beta\colon\mathcal{H}_0\times\mathcal{H}_0\times\cdots\times\mathcal{H}_0\to\mathcal{K}$$

is said to be symmetric if $\beta(\xi_1,\xi_2,\ldots,\xi_n)=\beta(\xi_{\pi(1)},\xi_{\pi(2)},\ldots,\xi_{\pi(n)})$,
for all $\pi\in S_n$, and antisymmetric if

$$\beta(\xi_1,\xi_2,\ldots,\xi_n)=\epsilon(\pi)\beta(\xi_{\pi(1)},\xi_{\pi(2)},\ldots,\xi_{\pi(n)}),$$

where $\epsilon$ is defined as in Proposition 16.1.1.


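Definition 16.6.2. Let $\mathcal{H}_0$ be an inner product space. Suppose that
there exists a space $\otimes_S^n\mathcal{H}_0=\mathcal{H}_0\otimes_S\cdots\otimes_S\mathcal{H}_0$ with the following
properties:

(i) there is a symmetric n-linear map from $\mathcal{H}_0\times\cdots\times\mathcal{H}_0$ to $\otimes_S^n\mathcal{H}_0$
written as $(\psi_1,\ldots,\psi_n)\mapsto\psi_1\otimes_S\cdots\otimes_S\psi_n$;

(ii) for any symmetric n-linear map $\beta$ from $\mathcal{H}_0\times\cdots\times\mathcal{H}_0$ to a space $\mathcal{K}$,
there is a unique linear map $\hat\beta$ from $\otimes_S^n\mathcal{H}_0$ to $\mathcal{K}$ satisfying
$\hat\beta(\psi_1\otimes_S\cdots\otimes_S\psi_n)=\beta(\psi_1,\ldots,\psi_n)$;

then $\otimes_S^n\mathcal{H}_0$ is called the n-fold symmetric tensor product of $\mathcal{H}_0$.


Definition 16.6.3. Let $\mathcal{H}_0$ be an inner product space. Suppose
that there exists a space $\bigwedge^n\mathcal{H}_0=\mathcal{H}_0\wedge\cdots\wedge\mathcal{H}_0$ with the following
properties:

(i) there is an antisymmetric n-linear map from $\mathcal{H}_0\times\cdots\times\mathcal{H}_0$ to
$\bigwedge^n\mathcal{H}_0$ written as $(\psi_1,\ldots,\psi_n)\mapsto\psi_1\wedge\cdots\wedge\psi_n$;

(ii) for any antisymmetric n-linear map $\beta$ from $\mathcal{H}_0\times\cdots\times\mathcal{H}_0$ to a
space $\mathcal{K}$ there is a unique linear map $\hat\beta$ from $\bigwedge^n\mathcal{H}_0$ to $\mathcal{K}$ satisfying
$\hat\beta(\psi_1\wedge\cdots\wedge\psi_n)=\beta(\psi_1,\ldots,\psi_n)$;

then $\bigwedge^n\mathcal{H}_0$ is called the n-fold exterior product of $\mathcal{H}_0$.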


We have set things up so that bosons and fermions can be described
by elements of the symmetric tensor product and exterior product,
respectively. They can also be described by elements of the ordinary tensor
product that have a particular symmetry. To see this we note that there
is an obvious action of the permutation $\pi\in S_n$ on decomposable tensors,
given by

$$U(\pi)(\xi_1\otimes\xi_2\otimes\cdots\otimes\xi_n)=\xi_{\pi(1)}\otimes\xi_{\pi(2)}\otimes\cdots\otimes\xi_{\pi(n)}. \tag{16.24}$$

Since the tensor product space is spanned by decomposable tensors, we
can extend $U(\pi)$ linearly to the whole space. We leave it to the reader
to check the following result, which follows from the fact that, like the
ordinary tensor product, the symmetric and exterior products are unique
up to isomorphism.


Theorem 16.6.1. The symmetric tensor product can be identified
with the subspace of $\psi\in\mathcal{H}_0\otimes\cdots\otimes\mathcal{H}_0$ that satisfies $U(\pi)\psi=\psi$, for
all permutations $\pi$, and the exterior product with the subspace of $\psi$
that satisfies $U(\pi)\psi=-\psi$ for odd permutations.


We can now deduce that the other properties of tensor products given 
in Proposition 16.5.3, such as associativity and distributivity over direct 
sums, carry over to the exterior and symmetric tensor product. 
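
When $\mathcal{H}_0$ has finite dimension $d$ the sizes of these two subspaces can be computed from the permutation action. The sketch below is an illustration under a standard assumption that is not proved in the text: the averages $\frac{1}{n!}\sum_\pi U(\pi)$ and $\frac{1}{n!}\sum_\pi\epsilon(\pi)U(\pi)$ are the projections onto the symmetric and antisymmetric subspaces, so their traces give the dimensions. The function names are our own.

```python
from itertools import permutations, product
from math import comb, factorial

def sign(pi):
    """Parity epsilon(pi) of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for a in range(len(pi)) for b in range(a + 1, len(pi))
                     if pi[a] > pi[b])
    return -1 if inversions % 2 else 1

def trace_U(pi, d):
    """Trace of U(pi): the number of basis tensors e_{i_1} (x) ... (x) e_{i_n} it fixes."""
    n = len(pi)
    return sum(1 for idx in product(range(d), repeat=n)
               if all(idx[pi[a]] == idx[a] for a in range(n)))

def subspace_dims(d, n):
    """Dimensions of the symmetric and antisymmetric subspaces of the n-fold product."""
    perms = list(permutations(range(n)))
    sym = sum(trace_U(pi, d) for pi in perms) // factorial(n)
    alt = sum(sign(pi) * trace_U(pi, d) for pi in perms) // factorial(n)
    return sym, alt

print(subspace_dims(3, 2))   # two particles, three single-particle states
print(subspace_dims(2, 3))   # three particles, two single-particle states
```

The results agree with the binomial coefficients $\binom{d+n-1}{n}$ and $\binom{d}{n}$; in particular three fermions cannot share only two single-particle states.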


16.7*. Tensor products of group representations 


One often has a symmetry group $G$ acting on the spaces of the individual
particles and one wishes to know how this symmetry is reflected in the
combined system.


Definition 16.7.1. If $U_1$ and $U_2$ are unitary representations of $G$ on
$\mathcal{H}_1$ and $\mathcal{H}_2$ respectively, then there is a tensor product representation
$U=U_1\otimes U_2$ on $\mathcal{H}=\mathcal{H}_1\otimes\mathcal{H}_2$ defined by

$$U(x)(\psi_1\otimes\psi_2)=U_1(x)\psi_1\otimes U_2(x)\psi_2$$

for $x$ in $G$, $\psi_1$ in $\mathcal{H}_1$, and $\psi_2$ in $\mathcal{H}_2$.


This definition can similarly be extended to products of more than two
spaces. There is a very simple formula for the character of a tensor product
representation.


Proposition 16.7.1. Let $\chi_1$ and $\chi_2$ be the characters of the
representations $U_1$ and $U_2$; then the character of $U_1\otimes U_2$ is given by

$$\chi(g)=\chi_1(g)\chi_2(g),\quad\text{for all } g\in G.$$


Proof. From orthonormal bases $\xi_j$ and $\eta_k$ for $\mathcal{H}_1$ and $\mathcal{H}_2$ we may
construct the orthonormal basis $\xi_j\otimes\eta_k$ for $\mathcal{H}_1\otimes\mathcal{H}_2$. Then

$$\begin{aligned}
\operatorname{tr}(U_1(g)\otimes U_2(g))&=\sum_{j,k}\langle\xi_j\otimes\eta_k|U_1(g)\xi_j\otimes U_2(g)\eta_k\rangle\\
&=\sum_{j,k}\langle\xi_j|U_1(g)\xi_j\rangle\langle\eta_k|U_2(g)\eta_k\rangle\\
&=\sum_{j}\langle\xi_j|U_1(g)\xi_j\rangle\sum_{k}\langle\eta_k|U_2(g)\eta_k\rangle\\
&=\chi_1(g)\chi_2(g)
\end{aligned}\tag{16.25}$$

as asserted. □
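
The identity is easy to test numerically. In the following sketch (an illustration only) we take rotations through an angle $\theta$ about a fixed axis in the spin $\tfrac12$ and spin 1 representations, written as diagonal matrices in a basis of eigenvectors with $\hbar=1$; this choice of basis, the helper names, and the value of $\theta$ are our assumptions. The trace of the Kronecker product is compared with the product of the traces and with the character formula $\sin[(j+\tfrac12)\theta]/\sin(\tfrac12\theta)$.

```python
import cmath
import math

def kron(A, B):
    """Matrix of A (x) B in the basis xi_j (x) eta_k ordered by (j, k)."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def rotation_z(two_j, theta):
    """Rotation through theta about the z-axis in D^j, j = two_j / 2, as a diagonal
    matrix in a basis of eigenvectors with eigenvalues m = -j, ..., j (hbar = 1)."""
    ms = [-two_j / 2 + n for n in range(two_j + 1)]
    return [[cmath.exp(-1j * m * theta) if r == c else 0 for c in range(len(ms))]
            for r, m in enumerate(ms)]

def irreducible_character(two_j, theta):
    """sin((j + 1/2) theta) / sin(theta / 2) with j = two_j / 2."""
    return math.sin((two_j + 1) * theta / 2) / math.sin(theta / 2)

theta = 0.9                                            # arbitrary rotation angle
U1, U2 = rotation_z(1, theta), rotation_z(2, theta)    # spin 1/2 and spin 1
chi_product = trace(kron(U1, U2))
chi_factors = trace(U1) * trace(U2)
print(abs(chi_product - chi_factors) < 1e-12)
```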


Remark 16.7.1. By Corollary 9.8.3 we already know that any finite-dimensional
representation $U$ of the rotation group can be decomposed as
a direct sum of irreducible subrepresentations. Its character is the sum of
irreducible characters and so the coefficients $n_l$ will just give the multiplicity
or number of times that $D^l$ occurs in the decomposition of $U$. This must
apply in particular when $U$ is the tensor product of representations, for
example when $U$ is the tensor product of two irreducible representations.


Theorem 16.7.2. (The Clebsch–Gordan series) The tensor
product of irreducible representations of the rotation group decomposes as

$$D^k\otimes D^l\cong D^{|k-l|}\oplus D^{|k-l|+1}\oplus\cdots\oplus D^{k+l}.$$


Proof. Recalling the expression in Theorem 9.8.1 for the irreducible
characters of the rotation group, the character of the right-hand side is

$$\begin{aligned}
\sum_{j=|k-l|}^{k+l}\Delta_j(\theta)&=\operatorname{cosec}^2(\tfrac12\theta)\sum_{j=|k-l|}^{k+l}\sin(\tfrac12\theta)\sin[(j+\tfrac12)\theta]\\
&=\tfrac12\operatorname{cosec}^2(\tfrac12\theta)\sum_{j=|k-l|}^{k+l}\{\cos(j\theta)-\cos[(j+1)\theta]\}\\
&=\tfrac12\operatorname{cosec}^2(\tfrac12\theta)\{\cos(|k-l|\theta)-\cos[(k+l+1)\theta]\}\\
&=\operatorname{cosec}^2(\tfrac12\theta)\sin[(k+\tfrac12)\theta]\sin[(l+\tfrac12)\theta],
\end{aligned}\tag{16.26}$$

which is just $\Delta_k(\theta)\Delta_l(\theta)$, the character of $D^k\otimes D^l$, as required. □


Example 16.7.1. $D^l\otimes D^1\cong D^{l+1}\oplus D^l\oplus D^{l-1}$.


Example 16.7.2. $D^{1/2}\otimes D^{1/2}\cong D^1\oplus D^0$. This shows that two spin $\tfrac12$
particles have combined angular momentum 0 or 1. This is in line with
intuition, which suggests that their spins either line up to give $\tfrac12+\tfrac12=1$
or are in opposite directions giving $\tfrac12-\tfrac12=0$.
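
The series can also be recovered by simple bookkeeping. The sketch below is an illustration under two assumptions of ours: that $D^j$ has the $2j+1$ eigenvalues $m=-j,\ldots,j$ for one component of angular momentum (in units of $\hbar$), and that the tensor product decomposes completely into irreducibles. The eigenvalues of the product are the sums $m_1+m_2$, and we repeatedly peel off the largest remaining eigenvalue together with the rest of its multiplet. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

def weights(j):
    """Eigenvalues m = -j, -j+1, ..., j of one angular momentum component in D^j."""
    j = Fraction(j)
    return [-j + n for n in range(int(2 * j) + 1)]

def decompose(k, l):
    """Spins j occurring in D^k (x) D^l, found by peeling off highest weights."""
    remaining = Counter(m1 + m2 for m1 in weights(k) for m2 in weights(l))
    spins = []
    while any(count > 0 for count in remaining.values()):
        top = max(m for m, count in remaining.items() if count > 0)
        spins.append(top)
        remaining.subtract(weights(top))   # remove the whole multiplet D^top
    return sorted(spins)

half = Fraction(1, 2)
print(decompose(half, half))   # two spin-1/2 particles
print(decompose(2, 1))
```

For $k=2$, $l=1$ this returns the spins $1,2,3$, that is $|k-l|,\ldots,k+l$, in agreement with the theorem.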


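Theorem 16.7.3. If $U_1$ and $U_2$ are representations of $\mathbb{R}$ with
infinitesimal generators $H_1$ and $H_2$ respectively then the infinitesimal
generator of $U_1\otimes U_2$ is $H_1\otimes1+1\otimes H_2$.


Proof. The generator is

$$\begin{aligned}
i\hbar\frac{d}{dt}(U_1\otimes U_2)\Big|_{t=0}&=i\hbar\frac{dU_1}{dt}\otimes U_2\Big|_{t=0}+i\hbar\,U_1\otimes\frac{dU_2}{dt}\Big|_{t=0}\\
&=H_1\otimes1+1\otimes H_2,
\end{aligned}\tag{16.27}$$

as stated. □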
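
A numerical illustration of this theorem, offered as a sketch under stated assumptions: we take $\hbar=1$, the convention $U_j(t)=\exp(-iH_jt)$, two arbitrary $2\times2$ Hermitian matrices, and a matrix exponential computed from its power series (adequate only for small matrices). All helper names are our own. The sketch checks both that $U_1(t)\otimes U_2(t)=\exp[-it(H_1\otimes1+1\otimes H_2)]$ and that a central-difference derivative at $t=0$ reproduces the generator.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(s, A):
    return [[s * a for a in row] for row in A]

def expm(M, terms=40):
    """Matrix exponential by its power series (fine for the small matrices used here)."""
    result = term = identity(len(M))
    for k in range(1, terms):
        term = scale(1 / k, matmul(term, M))   # M^k / k!
        result = add(result, term)
    return result

def U(H, t):
    """One-parameter group generated by H, assuming U(t) = exp(-i H t) with hbar = 1."""
    return expm(scale(-1j * t, H))

def max_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

H1 = [[0, 1], [1, 0]]       # illustrative Hermitian generators
H2 = [[2, 1j], [-1j, 0]]
G = add(kron(H1, identity(2)), kron(identity(2), H2))   # H1 (x) 1 + 1 (x) H2

t = 0.7
group_error = max_diff(kron(U(H1, t), U(H2, t)), U(G, t))

# i d/dt (U1 (x) U2) at t = 0, by a central difference.
d = 1e-5
derivative = scale(1j / (2 * d),
                   add(kron(U(H1, d), U(H2, d)), scale(-1, kron(U(H1, -d), U(H2, -d)))))
generator_error = max_diff(derivative, G)
print(group_error < 1e-10, generator_error < 1e-6)
```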


16.8*. Tensor operators 


Sometimes it is possible to extract information from group representations 
even when the quantum system is not symmetric. For example, although, 
as we have seen, a uniform electric field breaks the rotational symmetry of 
an atomic Hamiltonian, it does so in a rather controlled way.


Definition 16.8.1. Let $U$, $W$, and $D$ be unitary representations of
$G$ on the spaces $\mathcal{H}$, $\mathcal{K}$, and $\mathcal{L}$, respectively. A tensor operator for
$(U,W,D)$ is a linear map $T$ from $\mathcal{L}$ to the space of linear transformations
$\mathcal{L}(\mathcal{K},\mathcal{H})$, such that for all $v$ in $\mathcal{L}$ and $x$ in $G$

$$U(x)T(v)W(x)^{-1}=T(D(x)v).$$


Example 16.8.1. If $D$ is the trivial representation on the space $\mathcal{L}=\mathbb{C}$,
then $T(v)=vT(1)$ is determined by the single operator $T(1)$. The defining
relation becomes

$$U(x)T(1)=T(1)W(x), \tag{16.28}$$

so that $T(1)$ is just an intertwining operator for $U$ and $W$.


Example 16.8.2. Let $U=W$ be the representation of SO(3) on wave
functions defined in Example 9.1.2, and $D=D^1$ the three-dimensional
irreducible representation. Then for $v$ in $\mathbb{C}^3$ we may define $X(v)=\mathbf{X}\cdot v$,
$P(v)=\mathbf{P}\cdot v$, $L(v)=\mathbf{L}\cdot v$, where $\mathbf{X}$, $\mathbf{P}$, and $\mathbf{L}$ are the usual position,
momentum, and angular momentum operators. Each of these is a tensor
operator for $(U,U,D^1)$. For example,

$$\begin{aligned}
(U(A)X(v)U(A)^{-1}\psi)(\mathbf{r})&=(X(v)U(A)^{-1}\psi)(A^{-1}\mathbf{r})\\
&=(v\cdot(A^{-1}\mathbf{r}))(U(A)^{-1}\psi)(A^{-1}\mathbf{r})\\
&=(Av\cdot\mathbf{r})\psi(\mathbf{r})\\
&=(X(Av)\psi)(\mathbf{r}),
\end{aligned}\tag{16.29}$$

so that

$$U(A)X(v)U(A)^{-1}=X(Av). \tag{16.30}$$

If we set $A=R_j(t)$ and differentiate this relationship we get the
commutation relations

$$[L_j,X_k]=i\hbar\epsilon_{jkl}X_l \tag{16.31}$$

proved in Proposition 8.1.1.
Example 16.8.3. Let $U=W$ be the representation of the translations
of $\mathbb{R}^3$ on wave functions described in Example 9.1.4. For a continuous
function $f$ let $M(f)=f(\mathbf{X})$, that is

$$(M(f)\psi)(\mathbf{x})=f(\mathbf{x})\psi(\mathbf{x}), \tag{16.32}$$

and let $D$ be defined on functions $f$ by exactly the same formula as $U$ is
defined on wave functions. Then

$$\begin{aligned}
(U(\mathbf{a})M(f)\psi)(\mathbf{x})&=(M(f)\psi)(\mathbf{x}-\mathbf{a})\\
&=f(\mathbf{x}-\mathbf{a})\psi(\mathbf{x}-\mathbf{a})\\
&=(D(\mathbf{a})f)(\mathbf{x})(U(\mathbf{a})\psi)(\mathbf{x}),
\end{aligned}\tag{16.33}$$

so that $M$ defines a tensor operator. If we take $t\mathbf{a}$ in place of $\mathbf{a}$ and
differentiate with respect to $t$ then we obtain the relation

$$[P(\mathbf{a}),f(\mathbf{X})]=-i\hbar(\mathbf{a}\cdot\nabla f)(\mathbf{X}). \tag{16.34}$$

On taking $f(\mathbf{x})=x_j$ we easily recover the commutation relations between
momentum and position.

Arthur Wightman and George Mackey independently suggested that
this is the real origin of the commutation relations. If quantum theory is
to be able to describe local interactions at a point then a position operator
$\mathbf{X}$ must exist. If it is to be independent of the choice of origin then
$M(f)=f(\mathbf{X})$ must be a tensor operator for the translations, and then the
commutation relations follow.


Lemma 16.8.1. Let $T$ be a tensor operator for $(U,W,D)$. Then
there exists an operator $\hat T$ from $\mathcal{L}\otimes\mathcal{K}$ to $\mathcal{H}$ defined for $v$ in $\mathcal{L}$ and $\psi$
in $\mathcal{K}$ by

$$\hat T(v\otimes\psi)=T(v)\psi.$$

Moreover, $\hat T$ intertwines $D\otimes W$ and $U$.


Proof. There is a bilinear map from $\mathcal{L}\times\mathcal{K}$ to $\mathcal{H}$ defined by

$$(v,\psi)\mapsto T(v)\psi, \tag{16.35}$$

so by Definition 16.5.1(ii) there must exist a linear map $\hat T$ from $\mathcal{L}\otimes\mathcal{K}$ to
$\mathcal{H}$ satisfying

$$\hat T(v\otimes\psi)=T(v)\psi. \tag{16.36}$$

Moreover, for any $x$ in $G$,

$$\begin{aligned}
U(x)\hat T(v\otimes\psi)&=U(x)T(v)\psi\\
&=T(D(x)v)W(x)\psi\\
&=\hat T(D(x)v\otimes W(x)\psi)\\
&=\hat T(D(x)\otimes W(x))(v\otimes\psi),
\end{aligned}\tag{16.37}$$

from which the result now follows. □


Theorem 16.8.2. (Wigner–Eckart theorem) Let $U$ be an irreducible
representation of $G$. Suppose that $D\otimes W$ can be decomposed
into irreducibles, and that $U$ occurs at most once in the decomposition.
Let $T$ be a tensor operator for $(U,W,D)$; then for any $\xi$ in $\mathcal{H}$,
$\eta$ in $\mathcal{K}$, and $v$ in $\mathcal{L}$

$$\langle\xi|T(v)\eta\rangle=\lambda_T\,c(\xi,v,\eta),$$

where $\lambda_T$ is independent of the vectors $\xi$, $\eta$, and $v$, and $c$ is
independent of $T$.


Proof. Suppose that $U$ occurs exactly once in the decomposition of $D\otimes W$,
so that we may write

$$D\otimes W=U_0\oplus U^\perp, \tag{16.38}$$

where $U_0$ is equivalent to $U$ and $U^\perp$ does not contain any irreducible
equivalent to $U$. Accordingly the linear operator $\hat T$ must split into $T_0\oplus T_\perp$, where
$T_0$ denotes the restriction to the space on which $U_0$ operates and $T_\perp$ the
restriction to the space of $U^\perp$. Since $U$ is irreducible Schur's lemma tells us
that $T_0$ is either 0 or an isomorphism. Moreover, if $S$ denotes any non-zero
intertwining operator then $S^{-1}T_0$ intertwines $U$ with itself and so for some
scalar $\lambda_T$

$$S^{-1}T_0=\lambda_T. \tag{16.39}$$

On the other hand, since $U^\perp$ contains no irreducible equivalent to $U$,
$T_\perp=0$, so we have

$$\hat T=\lambda_TS\oplus0=\lambda_TS. \tag{16.40}$$

If $D\otimes W$ contains no representation equivalent to $U$ then, by the same
argument, $\hat T=0$. From this relation we see that

$$\begin{aligned}
\langle\xi|T(v)\eta\rangle&=\langle\xi|\hat T(v\otimes\eta)\rangle\\
&=\langle\xi|\lambda_TS(v\otimes\eta)\rangle\\
&=\lambda_T\langle\xi|S(v\otimes\eta)\rangle.
\end{aligned}\tag{16.41}$$

The answer now follows on writing $c(\xi,v,\eta)=\langle\xi|S(v\otimes\eta)\rangle$. □


Definition 16.8.2. The numbers $c(\xi,v,\eta)$ are called Clebsch–Gordan
coefficients.


Remark 16.8.1. The point of this result is that, since the Clebsch–Gordan
coefficients are independent of $T$ they can be tabulated for interesting
representations. Then one needs only to find the single scalar $\lambda_T$ in order to
know all the matrix elements $\langle\xi|T(v)\eta\rangle$. This can be done by a single direct
calculation for some convenient triple $(\xi,v,\eta)$.


By Theorem 16.7.2 we have

$$D^k\otimes D^l=D^{|k-l|}\oplus D^{|k-l|+1}\oplus\cdots\oplus D^{k+l}, \tag{16.42}$$

so that the hypotheses of the theorem hold for the representations $U=D^j$,
$W=D^l$, and $D=D^k$ of the rotation group. The perturbing term for the
Stark effect, $\mathbf{F}\cdot\mathbf{X}=X(\mathbf{F})$, is a tensor operator. So, if we take $\xi$ and $\eta$
to be wave functions of angular momentum $j$ and $l$, respectively, then the
theorem tells us that $\langle\xi|\mathbf{F}\cdot\mathbf{X}\eta\rangle$ vanishes unless $|j-l|\le1$, and can provide
information about the non-vanishing values too. A useful tool for this end
is the following:


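Corollary 16.8.3. Under the hypotheses of the theorem let $S$ be a
non-zero tensor operator for $(U,W,D)$. Then there exists a complex
number $\mu_T$ such that for any $\xi$ in $\mathcal{H}$, $\eta$ in $\mathcal{K}$, and $v$ in $\mathcal{L}$

$$\langle\xi|T(v)\eta\rangle=\mu_T\langle\xi|S(v)\eta\rangle.$$


Proof. Since $S$ is non-zero the scalar $\lambda_S$ does not vanish, and therefore

$$\langle\xi|T(v)\eta\rangle=\lambda_T\,c(\xi,v,\eta)=\frac{\lambda_T}{\lambda_S}\langle\xi|S(v)\eta\rangle, \tag{16.43}$$

so that we may take $\mu_T=\lambda_T/\lambda_S$. □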

This allows us to calculate all the inner products involving a given sort
of tensor operators, from those for a single tensor operator, $S$, such as $\mathbf{L}$
or $\mathbf{X}$. This circumvents direct calculation and deals with a whole class of
such problems at the same time.


Exercises


16.1° Explain the behaviour of the wave function for two identical particles
moving in one dimension when the particles are interchanged if they
are bosons. What happens if the particles are fermions?
Suppose that particles, moving in one dimension, of ‘charge’ $e_1$
and $e_2$ interact according to the potential

$$-\tfrac12e_1e_2(X_1-X_2)^2.$$

Write down the Hamiltonian for two spinless particles, each of mass
$m$ and ‘charge’ $-1$, moving in the field of an infinitely massive
‘nucleus’ of ‘charge’ $K>0$ placed at the origin. By rotating coordinates,
or otherwise, show that the Hamiltonian only has a ground state if
$K>2$. If the particles are bosons find the energies and multiplicities
of the lowest two energy levels when $K=8/3$.


16.2° Show that the states of $n$ distinguishable spin $l$ particles can be
described quantum mechanically on a space of dimension $(2l+1)^n$.
How does the description change if the particles are indistinguishable
bosons?

Show that if the components of L® give the angular momentum 
of the k-th particle for k = 1,...,n, then the components of L“® + 
L®) also satisfy the commutation relations for angular momentum. 

Two spin $l$ particles are placed in a uniform magnetic field of
magnitude B and the interaction energy is represented by 


4yBL® ©), 


where the components of $\mathbf{L}^{(k)}$ give the angular momentum of the k-th
particle for $k=1,2$. Find the energy levels of the system and their
degeneracies when the particles are (i) distinguishable, (ii) bosons. 


16.3° Two independent spin $\tfrac12$ systems have spin operators $\mathbf{s}(1)=\tfrac12\hbar\boldsymbol{\sigma}(1)$
and $\mathbf{s}(2)=\tfrac12\hbar\boldsymbol{\sigma}(2)$. Show that the eigenvalues of $\boldsymbol{\sigma}(1)\cdot\boldsymbol{\sigma}(2)$ are 1
and $-3$ and find the corresponding eigenvectors.
State the property of the state vectors representing the physical
states of identical particles under identical particle interchange.
The Hamiltonian for a system of two spin $\tfrac12$ particles, labelled 1
and 2, is given by

$$H=h(1)+h(2)-\omega\,\boldsymbol{\sigma}(1)\cdot\boldsymbol{\sigma}(2).$$

The self-adjoint operator $h$ has non-degenerate eigenvalues $E_0<E_1<E_2<\ldots$
with eigenfunctions $\phi_0(\mathbf{x}),\phi_1(\mathbf{x}),\phi_2(\mathbf{x}),\ldots$, respectively,
and $h(j)$ acts on the space of particle $j$. What are the possible
energy eigenvalues for the two-particle system (i) when the particles
are not identical and (ii) when the particles are identical? Show
that the ground states in the two cases have the same degeneracy if

$$4\omega>(E_1-E_0).$$


17 Relativistic wave equations


Henceforth space by itself, and time by itself, are doomed to fade away
into mere shadows, and only a kind of union of the two will preserve an
independent reality.


HERMANN MINKOWSKI, lecture at Cologne, 21 September 1908


17.1. Minkowski space 


One serious objection to the whole of quantum theory as we have developed 
it so far, is that it is inconsistent with the theory of relativity. In order to 
see how we may overcome this defect, at least as far as the special theory 
of relativity is concerned, we shall first recall some of its basic features. 
Shortly after Einstein's original 1905 papers on the subject, Minkowski 
realized that the physical ideas could also be understood geometrically, by 
combining space and time into a single four-dimensional real vector space, 
$M$, whose elements we shall call four-vectors. Each observer can assign
coordinates $(x^0,x^1,x^2,x^3)$ to a point $x$ in $M$, and interpret $\mathbf{x}=(x^1,x^2,x^3)$
as normal spatial coordinates and $x^0$ as $ct$, where $t$ is time and $c$ is the speed
of light. (By convention, these indices are written as superscripts rather
than subscripts.) The constancy of the speed of light gives the bilinear
form

$$g(x,y)=x^0y^0-x^1y^1-x^2y^2-x^3y^3=x^0y^0-\mathbf{x}\cdot\mathbf{y} \tag{17.1}$$

a special significance, since the associated quadratic form $g(x,x)=c^2t^2-|\mathbf{x}|^2$
vanishes for points on a light ray through the origin.


Definition 17.1.1. The four-dimensional real vector space, $M$, with
the bilinear form, $g$, is known as Minkowski space.
The group of linear transformations, $\Lambda$, of $M$ such that

$$g(\Lambda x,\Lambda x)=g(x,x),$$

is called the Lorentz group, and its elements are known as Lorentz
transformations. The proper orthochronous Lorentz group is the subgroup
of Lorentz transformations, $\Lambda$, with positive determinant such
that $g(x,\Lambda x)$ is positive whenever $g(x,x)$ is positive.
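
As an illustration of the definition (a sketch only, with names and sample vectors of our own choosing), the standard boost along the first spatial axis with speed $v$ can be checked numerically. We assume the usual form with $\gamma=(1-v^2/c^2)^{-1/2}$ acting on coordinates $(x^0,x^1,x^2,x^3)$.

```python
import math

def g(x, y):
    """Minkowski form g(x, y) = x^0 y^0 - x.y on four-vectors given as lists."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))

def boost_x(v_over_c):
    """Matrix of the standard boost along the x^1 axis with speed v (|v| < c)."""
    gamma = 1 / math.sqrt(1 - v_over_c ** 2)
    b = v_over_c
    return [[gamma, -gamma * b, 0, 0],
            [-gamma * b, gamma, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def apply(L, x):
    return [sum(l * xi for l, xi in zip(row, x)) for row in L]

L = boost_x(0.6)
x = [2.0, 0.5, -1.0, 0.3]    # g(x, x) > 0 for this illustrative vector
y = [1.0, 3.0, 0.2, -0.7]

preserved = abs(g(apply(L, x), apply(L, y)) - g(x, y)) < 1e-12
orthochronous_sample = g(x, x) > 0 and g(x, apply(L, x)) > 0
print(preserved, orthochronous_sample)
```

The second check tests the positivity condition for one sample vector only; it illustrates the condition rather than proving it.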


The last constraint ensures that proper orthochronous Lorentz transformations
do not reverse the direction of time, and the former then rules out
spatial reflections, just as the corresponding determinant condition picks
out the rotations from the orthogonal group. In fact, any rotation $R$ can
be identified with a Lorentz transformation that affects only the spatial
coordinates of an observer, sending $(x^0,\mathbf{x})$ to $(x^0,R\mathbf{x})$. The coordinate
systems of different observers are connected by Lorentz transformations.
We shall refer to the elements of $M$ as four-vectors. Each observer and
object traces out a curve, its worldline, in Minkowski space as time passes.
(For example, the path through $M$ of an observer stationary at the origin
of a reference frame is the curve $(ct,\mathbf{0})$.) At each point the curve has a
tangent, which, parametrizing the curve by $\tau$, is in the direction $U=dx/d\tau$.
Since observers move more slowly than light $g(U,U)$ is positive, and the
parameter $\tau$ can be chosen so that $g(U,U)=c^2$.


Definition 17.1.2. The parameter $\tau$ that ensures that $U=dx/d\tau$
satisfies $g(U,U)=c^2$ is known as the proper time along the curve and
$U$ is called the four-velocity of the observer.


In the case of the static observer $x=(ct,\mathbf{0})$ can be differentiated with
respect to $t$, to give the tangent $V=(c,\mathbf{0})$. This is already appropriately
normalized so that, in this case, the proper time is $t$.

Schrödinger's equation was motivated by Planck's law for energy and
the de Broglie relation for momentum, so we need to investigate the
corresponding relativistic concepts.


Definition 17.1.3. The rest mass of a body is its mass as measured
in a frame in which it is at rest. The four-momentum of a body whose
rest mass is $m$ and four-velocity $U$ is $p=mU$.


By definition, we have

$$g(p,p)=g(mU,mU)=m^2g(U,U)=m^2c^2. \tag{17.2}$$

As we shall now see the four-momentum combines both the momentum
and the energy of the body and this is the relativistic formula linking
them. With respect to a coordinate system it says that $(p^0)^2-|\mathbf{p}|^2=m^2c^2$,
which can be rewritten as

$$(p^0-mc)=|\mathbf{p}|^2/(p^0+mc). \tag{17.3}$$

For a static body on the curve $(ct,\mathbf{0})$ the four-momentum is $(mc,\mathbf{0})$, so we
expect that for slow moving bodies $\mathbf{p}$ should be small and $p^0$ close to $mc$.
Our previous relation then gives

$$(p^0-mc)\approx|\mathbf{p}|^2/2mc, \tag{17.4}$$

suggesting that we should interpret $c(p^0-mc)$ as the kinetic energy of the
body and $\mathbf{p}$ as its spatial momentum.


Definition 17.1.4. The energy of a body with four-momentum $p$ as
seen by an observer with four-velocity $W$ is $g(W,p)$.


For the ‘static’ observer with four-velocity $V=(c,\mathbf{0})$ this relativistic
energy is $g(V,p)=cp^0$. It is made up of the famous constant term, $mc^2$,
and additional kinetic terms, which, for slowly moving bodies, are
approximately $|\mathbf{p}|^2/2m$.
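
To see how good the approximation is, here is a small numerical sketch. The electron mass and the choice $|\mathbf{p}|=0.01\,mc$ are illustrative assumptions of ours, and the function names are hypothetical. It evaluates $cp^0$ from $(p^0)^2=m^2c^2+|\mathbf{p}|^2$, the exact kinetic term $c|\mathbf{p}|^2/(p^0+mc)$ from (17.3), and the approximation $|\mathbf{p}|^2/2m$.

```python
import math

c = 299_792_458.0      # speed of light in m/s
m = 9.1093837e-31      # electron rest mass in kg (approximate, for illustration)

def energy(m, p):
    """Relativistic energy c p^0, where (p^0)^2 = m^2 c^2 + |p|^2."""
    return c * math.sqrt((m * c) ** 2 + p ** 2)

def kinetic_exact(m, p):
    """c (p^0 - m c), rewritten as in (17.3) to avoid subtracting nearly equal numbers."""
    p0 = math.sqrt((m * c) ** 2 + p ** 2)
    return c * p ** 2 / (p0 + m * c)

def kinetic_slow(m, p):
    """Approximation |p|^2 / 2m, valid for slowly moving bodies."""
    return p ** 2 / (2 * m)

p = 0.01 * m * c       # a slowly moving body: |p| is one percent of m c
relative_error = abs(kinetic_slow(m, p) - kinetic_exact(m, p)) / kinetic_exact(m, p)
print(relative_error)
```

For this momentum the two kinetic terms differ by a few parts in $10^5$, and the error grows roughly like $|\mathbf{p}|^2/4m^2c^2$.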

It is easy to express a plane wave $\psi(t,\mathbf{x})=\exp[-i(\omega t-\mathbf{k}\cdot\mathbf{x})]$ in
relativistic form by introducing the frequency four-vector $\kappa=(\omega,c\mathbf{k})$ since
then $\psi(x)=\exp[-ig(\kappa,x)/c]$. We can now recognize that the Planck and de
Broglie relations are just the temporal and spatial components of a linear
relationship between the four-momentum and four-frequency: $cp=\hbar\kappa$. In
fact, it is sufficient to know the Planck relationship, which can be written
as $g(V,p)=\hbar g(V,\kappa)/c$, since if this is true for one observer, then
relativistic invariance means that the corresponding identity must be true
for all uniformly moving observers. That is, $g(W,p)=\hbar g(W,\kappa)/c$ or
$g(W,cp-\hbar\kappa)=0$, for any four-velocity $W$. It is easy to see that this
implies that $cp-\hbar\kappa=0$, giving the de Broglie law as well.

Einstein was originally led to develop the special theory of relativity in
order to harmonize Maxwell's electromagnetic equations and the laws of
motion. The laws of electromagnetism can be expressed in terms of an
electrostatic potential $\phi$ and a magnetic vector potential $\mathbf{A}$, from which
the electric and magnetic fields can be recovered as

$$\mathbf{E}=-\frac{\partial\mathbf{A}}{\partial t}-\operatorname{grad}\phi,\qquad\mathbf{B}=\operatorname{curl}\mathbf{A}. \tag{17.5}$$

The potentials are not unique since it is possible to add a gradient, $\operatorname{grad}\chi$,
to $\mathbf{A}$ and to subtract $\partial\chi/\partial t$ from $\phi$ without affecting the fields. Such
changes are called gauge transformations, and play a crucial role in the
modern understanding of physics. Not surprisingly, these concepts have an
elegant reformulation in Minkowski space.


Definition 17.1.5. Let $\phi$ be the electrostatic potential and $\mathbf{A}$ the
magnetic vector potential. The electromagnetic four-potential is
$\Phi=(\phi,c\mathbf{A})$.


The significance of this is that any observer with four-velocity $U$ will see
an electrostatic potential $g(U,\Phi)$ and the component of $\Phi$ perpendicular to
$U$ will give the magnetic vector potential.


17.2. The Klein—Gordon equation 


The Planck–de Broglie relations mean that the plane wave can now be
expressed as $\psi(x)=\exp[-ig(p,x)/\hbar]$, and the constraint $g(p,p)=m^2c^2$ on
the four-momentum leads to a wave equation much as in the non-relativistic
case in Chapter 2. First we introduce a relativistic Fourier transform that
decomposes any given wave function into plane waves

$$\psi(x)=(2\pi\hbar)^{-2}\int_Me^{-ig(p,x)/\hbar}(\mathcal{F}\psi)(p)\,dp. \tag{17.6}$$

Exploiting the Einstein summation convention that repeated Greek indices
are summed over the values 0, 1, 2, and 3, and writing $g^{\mu\nu}$ for the
components of $g$ (so that $g^{00}=1$ and $g^{11}=g^{22}=g^{33}=-1$), we can
differentiate to get

$$i\hbar g^{\mu\nu}\frac{\partial\psi}{\partial x^\nu}=(2\pi\hbar)^{-2}\int_Me^{-ig(p,x)/\hbar}p^\mu(\mathcal{F}\psi)(p)\,dp, \tag{17.7}$$

giving

$$\mathcal{F}\Big(i\hbar g^{\mu\nu}\frac{\partial\psi}{\partial x^\nu}\Big)=p^\mu\mathcal{F}\psi. \tag{17.8}$$

In particular $ic\hbar\,\partial/\partial x^0=i\hbar\,\partial/\partial t$ transforms into the relativistic energy
$cp^0$, and $i\hbar\nabla$ transforms into $-\mathbf{p}$. This agrees with the conventions used
earlier for non-relativistic quantum theory. The constraint $g(p,p)=m^2c^2$
therefore gives the equation

$$\hbar^2\Big(\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}-\nabla^2\psi\Big)+m^2c^2\psi=0, \tag{17.9}$$

which is known as the free Klein–Gordon equation.

Remark 17.2.1. If the mass $m$ vanishes then the Klein–Gordon equation
reduces to a standard wave equation

$$\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}-\nabla^2\psi=0. \tag{17.10}$$

In this way one can regard the ordinary wave equation as a degenerate
quantum equation although Planck's constant no longer appears in it.


17.3. The Yukawa potential 
For wave functions that do not depend on time, the Klein–Gordon equation
reduces to Yukawa's equation, $\nabla^2\psi=(mc/\hbar)^2\psi$, and when $\psi=\psi(r)$ is
spherically symmetric this becomes

$$\frac{1}{r}\frac{\partial^2(r\psi)}{\partial r^2}=(mc/\hbar)^2\psi, \tag{17.11}$$

whose solutions are of the form

$$r\psi=Ae^{mcr/\hbar}+Be^{-mcr/\hbar} \tag{17.12}$$

(compare Section 13.5). As usual the exponentially increasing solution is
physically unacceptable, leaving

$$\psi=\frac{B}{r}e^{-mcr/\hbar}. \tag{17.13}$$

The importance of this wave function was first pointed out by Yukawa
Hideki, who assigned it an important role in explaining the stability of
atoms. It is known as the Yukawa potential.

Experiments between the wars showed fairly clearly that the nucleus 
of an atom consists of positively charged protons and uncharged neu- 
trons. To explain why electrostatic repulsion between the protons did not 
cause the nucleus to explode, Yukawa proposed that there must be another 
force, now called the strong force, described by the Yukawa potential. If
$B > Ze^2/4\pi\epsilon_0$, then this exceeds the electrostatic potential $-Ze^2/4\pi\epsilon_0 r$
at short distances, but owing to the exponential term it is smaller at large
distances. Such an attractive force could overcome electrostatic potential
at short distances (less than about $\hbar/mc$) whilst being small enough outside
the atom to explain why it had hitherto eluded detection. Just as the
photon is associated to the electromagnetic field, particles called mesons 
are associated to Yukawa’s field, and the parameter m in Yukawa’s equa- 
tion can be interpreted as their mass. It can be estimated from the size of 
the nucleus. (In the case of the electromagnetic field the photon is mass- 
less, since m = 0.) When mesons were later detected experimentally it was 
found that there were many different sorts having different masses. The 
lightest is around 264 times more massive than an electron. 
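
The link between the meson mass and the range $\hbar/mc$ of the force can be illustrated numerically. The following is a minimal sketch, assuming rounded SI values for the constants; the constant and function names are illustrative and not taken from the text.

```python
# Rounded SI values of the constants (assumed here for illustration).
HBAR = 1.054571817e-34         # reduced Planck constant, J s
C = 2.99792458e8               # speed of light, m / s
M_ELECTRON = 9.1093837015e-31  # electron mass, kg


def yukawa_range(mass_kg):
    """Length hbar/(m c) over which exp(-m c r / hbar) falls by a factor e."""
    return HBAR / (mass_kg * C)


def mass_from_range(range_m):
    """Inverse estimate: the mass whose Yukawa range is range_m."""
    return HBAR / (range_m * C)


# Lightest meson quoted in the text: about 264 electron masses.
pion_range = yukawa_range(264 * M_ELECTRON)
print(f"range for 264 electron masses: {pion_range:.3e} m")

# Conversely, a range of about 1.5e-15 m corresponds to a few hundred electron masses.
print(f"mass for a 1.5e-15 m range: {mass_from_range(1.5e-15) / M_ELECTRON:.0f} electron masses")
```

The computed range is of the order of $10^{-15}$ m, comparable with nuclear dimensions, which is how the size of the nucleus gives an estimate of the mass.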






17.4. The Dirac equation 


Unfortunately, despite its elegance, there are serious reasons for believing
that the Klein-Gordon equation is not basic, but must be a consequence of
other more fundamental equations. Schrödinger’s non-relativistic equation
is only first order in time, and so can be solved once the starting value of the
wave function is known. Being second order, the Klein-Gordon equation
requires initial values of both $\psi$ and $\partial\psi/\partial t$ for its solution. To avoid this
striking difference one would need a first-order relativistic equation as well.
(Maxwell’s equations for electromagnetism provide a good model for this.
In empty space the fields satisfy second-order wave equations, but these are
a consequence of the first-order dynamical equations, $\partial\mathbf{E}/\partial t = c^2\,\mathrm{curl}\,\mathbf{B}$ and
$\partial\mathbf{B}/\partial t = -\mathrm{curl}\,\mathbf{E}$.) Now, relativistic invariance means that space and time
are on the same footing, so that the equation must have the same order
in the time and space derivatives, and, by the above argument, that order
should be 1.

Fourier transforming, this means that we need a linear constraint on
the momentum of the form $\gamma(p) = \mu$. On the other hand this must be
consistent with the known quadratic constraint $g(p,p) = m^2c^2$, or else we
should be able to eliminate $p^0$ between the equations and get unphysical
constraints involving only the spatial components of $p$. By rescaling $\gamma$ we
may as well assume that $\mu = mc$ and then the equations

$$\gamma(p)^2 = m^2c^2 = g(p,p) \tag{17.14}$$

tell us that for consistency we need $\gamma(p)^2 = g(p,p)$ for all $p \in M$.


Proposition 17.4.1. The following conditions are equivalent:
(a) $\gamma(p)^2 = g(p,p)$, for all $p \in M$;
(b) $\gamma(p)\gamma(q) + \gamma(q)\gamma(p) = 2g(p,q)$, for all $p, q \in M$.


Proof. The first condition clearly follows from the second by putting
$p = q$. Conversely, by the linearity of $\gamma$ we have $\gamma(p+q) = \gamma(p) + \gamma(q)$, so
that

$$\begin{aligned}\gamma(p)\gamma(q) + \gamma(q)\gamma(p) &= \gamma(p+q)^2 - \gamma(p)^2 - \gamma(q)^2\\ &= g(p+q, p+q) - g(p,p) - g(q,q) = 2g(p,q).\end{aligned} \tag{17.15}$$

□


Corollary 17.4.2. There are no solutions of these conditions for
which $\gamma(p)$ and $\gamma(q)$ commute for all $p$ and $q$.


Proof. If there were commuting solutions to these equations, then we
should have $g(p,q) = \gamma(p)\gamma(q)$, and

$$g(p,q)^2 = \gamma(p)^2\gamma(q)^2 = g(p,p)\,g(q,q), \tag{17.16}$$

for all $p$ and $q$ in $M$. This equation is, however, violated whenever $p$ and $q$
are orthogonal unit spatial vectors such as $(0,\mathbf{i})$ and $(0,\mathbf{j})$. □

Fortunately, there are non-commuting matrices $\gamma(p)$ that satisfy this
condition and, moreover, they are essentially unique.

Theorem 17.4.3. In 2 × 2 block form, the matrices

$$\gamma(p) = \begin{pmatrix} p_0 & -\sigma.\mathbf{p}\\ \sigma.\mathbf{p} & -p_0\end{pmatrix}$$

satisfy the equation $\gamma(p)^2 = g(p,p)$ for all $p \in M$, where the components
$\sigma_1$, $\sigma_2$, $\sigma_3$ of $\sigma$ are the Pauli spin matrices.


Proof. This follows by direct calculation, using the relation $(\sigma.\mathbf{p})^2 = |\mathbf{p}|^2$, since

$$\gamma(p)^2 = \begin{pmatrix} p_0^2 - (\sigma.\mathbf{p})^2 & 0\\ 0 & p_0^2 - (\sigma.\mathbf{p})^2\end{pmatrix} = \begin{pmatrix} p_0^2 - |\mathbf{p}|^2 & 0\\ 0 & p_0^2 - |\mathbf{p}|^2\end{pmatrix} = g(p,p)\begin{pmatrix}1&0\\0&1\end{pmatrix},$$

as required. □


These matrices are known as the Dirac matrices. In fact we shall show
later, in Section 17.8, that any operators $\gamma(p)$ depending linearly on $p \in M$
and satisfying this constraint are direct sums of operators equivalent to
these. This is therefore the smallest set of matrices that can be used and
all other 4 × 4 matrices satisfying the constraint are equivalent to these.

It is often useful to work in coordinates. We may choose an orthogonal
basis $e_\mu$, $\mu = 0,1,2,3$, for $M$ such that $g(e_0,e_0) = 1$ and $g(e_j,e_j) = -1$
for $j = 1,2,3$ and write $\gamma_\mu = \gamma(e_\mu)$, so that $\gamma(p) = \gamma_\mu p^\mu$. With the above
choice of matrices we have


Proposition 17.4.4. The operators $\gamma_\mu$ satisfy $\gamma_0^2 = 1$, $\gamma_j^2 = -1$, for
$j = 1, 2, 3$, and

$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 0,$$

for $\mu \neq \nu$.


Proof. By definition of $\gamma_\mu$ we have

$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2g(e_\mu, e_\nu) \tag{17.19}$$

and the result follows on giving the values of $g(e_\mu, e_\nu)$. □
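
The relations can also be checked numerically. The sketch below is only an illustration: it assumes the block form $\gamma(p) = \begin{pmatrix} p_0 & -\sigma.\mathbf{p}\\ \sigma.\mathbf{p} & -p_0\end{pmatrix}$ of Theorem 17.4.3 with the standard Pauli matrices, and the helper names are hypothetical.

```python
def gamma(p):
    """4x4 matrix gamma(p) for p = (p0, px, py, pz), in 2x2 block form
    [[p0, -sigma.p], [sigma.p, -p0]] with sigma.p = [[pz, a], [b, -pz]]."""
    p0, px, py, pz = p
    a, b = px - 1j * py, px + 1j * py
    return [
        [p0, 0, -pz, -a],
        [0, p0, -b, pz],
        [pz, a, -p0, 0],
        [b, -pz, 0, -p0],
    ]


def g(p, q):
    """Minkowski inner product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def is_scalar(m, s, tol=1e-12):
    """True if m equals s times the 4x4 identity, up to rounding."""
    return all(abs(m[i][j] - (s if i == j else 0)) < tol
               for i in range(4) for j in range(4))


BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

# gamma_mu gamma_nu + gamma_nu gamma_mu = 2 g(e_mu, e_nu) for all basis pairs.
relations_hold = all(
    is_scalar(anticommutator(gamma(e_mu), gamma(e_nu)), 2 * g(e_mu, e_nu))
    for e_mu in BASIS for e_nu in BASIS
)

# gamma(p)^2 = g(p, p) for an arbitrary four-momentum.
p = (2.0, 0.3, -1.1, 0.7)
square_is_gpp = is_scalar(matmul(gamma(p), gamma(p)), g(p, p))

print(relations_hold, square_is_gpp)
```

Both checks should report True; the second is the coordinate-free condition (a) of Proposition 17.4.1 evaluated at one sample vector.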


We shall refer to the identities in this proposition and its earlier coordinate-free
versions as the anticommutation relations for the $\gamma$ matrices.


Definition 17.4.1. The four-dimensional space on which the $\gamma$ matrices
act is called the space of Dirac spinors.


This suggests that the appropriate relativistic equation for which we
have been looking is, in momentum space form,

$$\gamma(p)(\mathcal{F}\psi)(p) = mc\,(\mathcal{F}\psi)(p), \tag{17.20}$$

where $\psi$ must now be a spinor-valued wave function, with four components.
Recalling the formulae following equation (17.2) for the transforms of the
momenta we now make the following definition:

Definition 17.4.2. In terms of the four-momentum operators

$$P^0 = i\hbar\frac{\partial}{\partial x^0}, \qquad P^j = -i\hbar\frac{\partial}{\partial x^j},$$

the free Dirac equation is given by

$$\gamma(P)\psi = mc\psi,$$

where the differential operator $D = \gamma(P) = \gamma_\mu P^\mu$ is called the Dirac
operator.


In practice the different signs attached to the space and time derivatives
(which derived ultimately from the different signs in $g$) are often a nuisance,
so it is useful to remove them by a redefinition of the $\gamma$ matrices. We simply
set $\gamma^0 = \gamma_0$, but $\gamma^j = -\gamma_j$. (This is consistent with the conventions used
in general relativity.) The Dirac operator may then be rewritten as

$$D = i\hbar\gamma^\mu\frac{\partial}{\partial x^\mu} = i\hbar\gamma^\mu\partial_\mu, \tag{17.21}$$

and the free Dirac equation becomes

$$i\hbar\gamma^\mu\partial_\mu\psi = mc\psi. \tag{17.22}$$


It is believed to be the equation appropriate to the description of an electron 
and also other light particles, such as the muon and tau particle, which seem 
to be distinguished from the electron only by their masses. 

The Dirac equation can easily be converted to a Schrödinger equation.
In fact, using the convention that repeated Roman indices are summed over
1, 2, and 3, we have

$$i\hbar\gamma^0\frac{\partial\psi}{\partial x^0} = \left(mc - i\hbar\gamma^j\frac{\partial}{\partial x^j}\right)\psi, \tag{17.23}$$

or, on using the fact that $(\gamma^0)^2 = 1$,

$$i\hbar\frac{\partial\psi}{\partial t} = c\gamma^0\left(mc - i\hbar\gamma^j\frac{\partial}{\partial x^j}\right)\psi, \tag{17.24}$$

which inspires the following definition:

Definition 17.4.3. The operator

$$H_D = mc^2\gamma_0 - c\gamma_0\gamma_jP^j = c\gamma^0\left(mc - i\hbar\gamma^j\frac{\partial}{\partial x^j}\right)$$

is called the free Dirac Hamiltonian.


We recall that $\gamma(p)$ was defined to ensure that

$$g(p,p)\,\tilde\psi(p) = \gamma(p)^2\tilde\psi(p) = m^2c^2\,\tilde\psi(p), \tag{17.25}$$

where we have written $\mathcal{F}\psi = \tilde\psi$ for the Fourier transform. This means that
$\tilde\psi(p)$ must vanish except when $p$ is on the hyperboloid $g(p,p) = m^2c^2$. The
inner product on Dirac spinors is given by


$$\langle\phi|\psi\rangle = mc\int \tilde\phi(p)^\dagger\,\tilde\psi(p)\,\frac{dp^1\,dp^2\,dp^3}{|p^0|}, \tag{17.26}$$


where $\dagger$ denotes the conjugate transpose of the Dirac spinor. If the wave
function is concentrated in the region where the spatial momentum $|\mathbf{p}|$ is
small then $|p^0| \approx mc$ and this closely approximates the non-relativistic
formula for momentum space inner products.


17.5. Antiparticles 


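Given the block form of the $\gamma$ matrices, it is sensible to split each Dirac
spinor into a pair of two-component vectors

$$\psi(x) = \begin{pmatrix}\psi_+(x)\\ \psi_-(x)\end{pmatrix}, \tag{17.27}$$

and similarly for its Fourier transform. Then the Dirac equation takes the
explicit form

$$\begin{pmatrix} p_0 & -\sigma.\mathbf{p}\\ \sigma.\mathbf{p} & -p_0\end{pmatrix}\begin{pmatrix}\tilde\psi_+\\ \tilde\psi_-\end{pmatrix} = mc\begin{pmatrix}\tilde\psi_+\\ \tilde\psi_-\end{pmatrix}, \tag{17.28}$$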


or, equivalently, as a pair of coupled two-component equations

$$\begin{aligned}(p^0 - mc)\,\tilde\psi_+ &= \sigma.\mathbf{p}\,\tilde\psi_-,\\ (p^0 + mc)\,\tilde\psi_- &= \sigma.\mathbf{p}\,\tilde\psi_+.\end{aligned} \tag{17.29}$$

We expect $p^0$ to be positive, so that $p^0 \neq -mc$. Then the second equation
defines $\tilde\psi_-$ as

$$\tilde\psi_- = \frac{\sigma.\mathbf{p}}{p^0 + mc}\,\tilde\psi_+, \tag{17.30}$$

and the consistency condition $g(p,p) = m^2c^2$ ensures that this also satisfies
the first equation. Thus the problem is completely solved once one knows
the two-component wave function $\tilde\psi_+$. Formally, $\tilde\psi(p)$ appears in the inverse
Fourier transform for $\psi(x)$ in the combination $\exp(-ig(x,p)/\hbar)\,\tilde\psi(p)$,
which has the form of a plane wave, so these are often called plane wave
solutions.
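
The claim that the lower component determined by the second equation automatically satisfies the first on the mass shell can be tested numerically. The following is a minimal sketch in units with $mc = 1$; the sample momentum, the sample spinor, and the function names are illustrative assumptions.

```python
import math


def sigma_dot(p):
    """2x2 matrix sigma.p for a spatial momentum p = (px, py, pz)."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]


def apply(m, v):
    """Apply a 2x2 matrix to a two-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def lower_component(p0, p, mc, psi_plus):
    """Second coupled equation solved for psi_minus: sigma.p psi_plus / (p0 + mc)."""
    return [x / (p0 + mc) for x in apply(sigma_dot(p), psi_plus)]


def first_equation_residual(p0, p, mc, psi_plus):
    """Largest component of (p0 - mc) psi_plus - sigma.p psi_minus."""
    psi_minus = lower_component(p0, p, mc, psi_plus)
    lhs = [(p0 - mc) * x for x in psi_plus]
    rhs = apply(sigma_dot(p), psi_minus)
    return max(abs(l - r) for l, r in zip(lhs, rhs))


mc = 1.0                      # units with m c = 1 (illustrative)
p = (0.3, -0.4, 1.2)          # arbitrary spatial momentum
p0_shell = math.sqrt(mc**2 + sum(x * x for x in p))
psi_plus = [1.0, 0.5j]        # arbitrary upper two-component spinor

on_shell = first_equation_residual(p0_shell, p, mc, psi_plus)
off_shell = first_equation_residual(p0_shell + 0.5, p, mc, psi_plus)
print(on_shell, off_shell)
```

The on-shell residual is zero up to rounding because $(\sigma.\mathbf{p})^2 = |\mathbf{p}|^2 = (p^0 - mc)(p^0 + mc)$, whereas moving $p^0$ off the hyperboloid leaves a residual of order one.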

Normally we should expect the spatial momenta to be small and $p^0 \approx mc$,
so that the $\tilde\psi_-$ components would be small with respect to those of
$\tilde\psi_+$, but there is no mathematical reason why this should not be reversed.
The problem is that the consistency condition $g(p,p) = m^2c^2$ is satisfied
whenever

$$p^0 = \pm\,(m^2c^2 + |\mathbf{p}|^2)^{1/2}, \tag{17.31}$$


so that the energy could be positive or negative. In principle this is inherent
in the relativistic relationship between energy and momentum, but without
quantum theory one could argue that energy can only be lost continuously,
making it impossible to jump the gap from the lowest positive energy, $mc^2$,
to the highest negative energy, $-mc^2$. In quantum theory energy can be lost
in discrete packets, and, since particles tend to lose energy, one would have
expected all the particles to have settled into negative energy states long
ago. Dirac realized, however, that since the equation describes spin $\tfrac12$ particles,
which are therefore fermions, this difficulty could be circumvented.
The Pauli exclusion principle forbids multiple occupation of a state by two 
fermions, so that if all the negative energy states are already occupied then 
the remaining particles must have positive energies. Since one measures 
only energy differences this vast sea of negative energy particles could go 
unnoticed. 

In practice one would expect that from time to time a particle of energy
$E < -mc^2$ would absorb enough energy to give it positive energy
and leave a vacancy or ‘hole’ in the sea of negative energy particles. The
zero energy of such a hole would still exceed by $-E > mc^2$ the negative
energy of the particle that was there before, and the hole could therefore
be interpreted as a particle of rest mass $m$ and positive energy $-E$. This is
known as the antiparticle associated with the original particle. By boosting 
the previously unnoticed negative energy particle to a positive energy and 
leaving behind the hole, interpreted as an antiparticle, the original energy 
would therefore seem to have simultaneously created a particle and an antiparticle.
Conversely the positive energy particle might drop back into the
hole releasing the excess energy as it does so, and this would look like the
particle and antiparticle annihilating each other in a burst of energy. We 
shall see, when we consider the Dirac equation in an electromagnetic field 
in Section 18.3, that these antiparticles must have the opposite charge to 
the original particles. A couple of years after Dirac published his equation 
Anderson discovered a positively charged particle otherwise identical to the 
electron. This particle, now called the positron, is the antiparticle to the 
electron. It is now thought that to each kind of particle (whether or not 
it is described by the Dirac equation), there corresponds an antiparticle, 
though in some cases such as the photon the particle may be its own an- 
tiparticle. Towards the end of 1995 a team at CERN in Geneva succeeded 
in constructing anti-atoms from antiprotons and positrons. So far these 
have been very short lived, but the hope is that new techniques will give 
them a long enough lifetime to start checking whether, for example, gravity 
affects antimatter in the same way as ordinary matter. 


17.6. The Weyl equation 


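When the mass $m$ in the Dirac equation vanishes, the equations take a
simpler form

$$\begin{aligned} p^0\tilde\psi_+ &= \sigma.\mathbf{p}\,\tilde\psi_-,\\ p^0\tilde\psi_- &= \sigma.\mathbf{p}\,\tilde\psi_+.\end{aligned} \tag{17.32}$$

On setting $\tilde\psi_L = \tilde\psi_+ - \tilde\psi_-$ and $\tilde\psi_R = \tilde\psi_+ + \tilde\psi_-$, the equations decouple to
give two two-component equations

$$\begin{aligned}(p^0 + \sigma.\mathbf{p})\,\tilde\psi_L &= 0,\\ (p^0 - \sigma.\mathbf{p})\,\tilde\psi_R &= 0.\end{aligned} \tag{17.33}$$

It is, therefore, open to us to choose solutions in which one of the component
two-vectors vanishes.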
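
A brief sketch of the decoupling step, obtained by subtracting and adding the two massless equations $p^0\tilde\psi_\pm = \sigma.\mathbf{p}\,\tilde\psi_\mp$ (the tilde for Fourier transforms and the labels $L$, $R$ follow the definitions above):

```latex
\begin{aligned}
p^0(\tilde\psi_+ - \tilde\psi_-) &= \sigma.\mathbf{p}\,(\tilde\psi_- - \tilde\psi_+)
  &&\Longrightarrow\quad (p^0 + \sigma.\mathbf{p})\,\tilde\psi_L = 0,\\
p^0(\tilde\psi_+ + \tilde\psi_-) &= \sigma.\mathbf{p}\,(\tilde\psi_- + \tilde\psi_+)
  &&\Longrightarrow\quad (p^0 - \sigma.\mathbf{p})\,\tilde\psi_R = 0.
\end{aligned}
```

Since $(p^0 + \sigma.\mathbf{p})(p^0 - \sigma.\mathbf{p}) = (p^0)^2 - |\mathbf{p}|^2$, either equation has non-zero solutions only when $g(p,p) = 0$, as expected for a massless particle.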


Definition 17.6.1. The equation $(p^0 - \sigma.\mathbf{p})\,\tilde\psi_R = 0$ is known as the
Weyl equation.


This equation has been widely used to describe neutrinos, massless spin
$\tfrac12$ particles, predicted on theoretical grounds by Pauli and Fermi long before
the first was detected experimentally in 1956 by Fred Reines (who was
awarded a share of the 1995 Nobel Prize for his discovery) and Cowan. (It
is now known that there are distinct neutrinos associated with the electron,
the muon, and the tau particle.) This equation was initially scorned
by physicists, since it is not invariant under the parity operator. (The
parity operator reverses all spatial directions and so interchanges solutions
of $(p^0 - \sigma.\mathbf{p})\,\tilde\psi_R = 0$ with those of $(p^0 + \sigma.\mathbf{p})\,\tilde\psi_L = 0$. Having an equation
with vanishing $\tilde\psi_R$, but no corresponding equation with vanishing $\tilde\psi_L$,
implies that nature is not invariant under reflection. The suffix $L$ on $\psi$
refers to the fact that it is by convention a left-handed particle, and its
right-handed equivalent does not appear in nature.) ‘God could not be
only weakly left-handed’ was Pauli’s reaction, referring to the role of the 
neutrino in the so-called weak interactions. The neutrino itself was later 
discovered and when, in 1957, experiments of Wu confirmed the suggestion 
of Lee and Yang that parity violation could occur in weak interactions, 
Weyl’s equation was rehabilitated. Doubts of a different sort have arisen 
more recently, with suggestions that neutrinos might after all have a small 
mass, and that the three experimentally observed neutrinos might each be 
superpositions of neutrinos with different masses. There is, as yet, no firm 
experimental evidence for this, though it might help to reconcile astrophys- 
ical models of the interior of the sun with the paucity of solar neutrinos 
detected on the earth. In that case Weyl’s equation would just be an ap- 
proximation to the correct equation. 


17.7. The angular momentum 
The free Dirac Hamiltonian can be used exactly as in non-relativistic quan- 


tum mechanics to calculate energy levels and find constants of the motion. 
It is natural to start by considering angular momentum. 


Theorem 17.7.1. The Dirac Hamiltonian, $H_D$, commutes with $\mathbf{L} + \tfrac12\hbar\hat\sigma$, where

$$\hat\sigma = \begin{pmatrix}\sigma & 0\\ 0 & \sigma\end{pmatrix},$$

but not with the orbital angular momentum $\mathbf{L}$.


Proof. We shall start by calculating the commutator of $L_k$ with $H_D$. For
this we notice that all the $\gamma$ matrices have constant entries and so commute
with $L_k$. We know the commutators between $L_k$ and $P^j = -i\hbar\partial_j$ from
Proposition 8.1.1, so that we have

$$[L_k, H_D] = -c\gamma_0\gamma_j[L_k, P^j] = -ic\hbar\,\gamma_0\gamma_j\,\epsilon_{kjl}P^l, \tag{17.34}$$

which shows that $H_D$ does not commute with the orbital angular momentum
operators.

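On the other hand when we calculate the commutators of the Pauli spin
matrices (in the block-diagonal form $\hat\sigma_k$, with two copies of $\sigma_k$) with $H_D$
it is the $\gamma$ matrices that give problems. We first note
that from the explicit formulae

$$\gamma_0\gamma_j = \begin{pmatrix}1&0\\0&-1\end{pmatrix}\begin{pmatrix}0&-\sigma_j\\ \sigma_j&0\end{pmatrix} = \begin{pmatrix}0&-\sigma_j\\ -\sigma_j&0\end{pmatrix}. \tag{17.35}$$

This means that

$$[\hat\sigma_k, H_D] = [\hat\sigma_k,\ mc^2\gamma_0 - c\gamma_0\gamma_lP^l] = c\begin{pmatrix}0&[\sigma_k,\sigma_l]\\ [\sigma_k,\sigma_l]&0\end{pmatrix}P^l. \tag{17.36}$$

The commutation relations for the Pauli spin matrices enable us to reduce
this to

$$2ic\,\epsilon_{klj}\begin{pmatrix}0&\sigma_j\\ \sigma_j&0\end{pmatrix}P^l = -2ic\,\epsilon_{klj}\,\gamma_0\gamma_jP^l. \tag{17.37}$$

Multiplying this by $\tfrac12\hbar$, adding it to the previous commutator, and using
the antisymmetry of $\epsilon_{kjl}$, gives the result. It was not actually necessary to
use the explicit matrix form of the $\gamma$ matrices: it is perfectly possible to deduce
the result directly from the anticommutation relations in Proposition
17.4.4. In that case the conserved quantity is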
$$L_k + \tfrac14 i\hbar\,\epsilon_{kjl}\,\gamma_j\gamma_l.$$

This result shows that one must add a spin term, two copies of $\tfrac12\hbar\sigma$,
to the orbital angular momentum before one obtains an expression that
commutes with $H_D$ and so is a constant of the motion. If we wish to retain
the law of conservation of angular momentum then we are more or less
forced to assign to the electron an intrinsic angular momentum or spin of
$\tfrac12\hbar\sigma$. This confirms the idea already suggested by the presence of the Pauli
spin matrices that the Dirac equation describes a particle with spin $\tfrac12$.

We conclude this section with another elementary property of the free 
Dirac equation. Knowing the Hamiltonian, it is easy to calculate the ve- 
locity of a Dirac particle. 


Proposition 17.7.2. In the Heisenberg picture the velocity of a
particle described by the Dirac equation is

$$\frac{dX^k}{dt} = c\gamma_0\gamma^k.$$


Proof. Since the position operators $X^k$ commute with the $\gamma$ matrices we
have

$$i\hbar\frac{dX^k}{dt} = [X^k, H_D] = -c\gamma_0\gamma_j[X^k, P^j] = -ic\hbar\,\gamma_0\gamma_j\delta_{jk} = -ic\hbar\,\gamma_0\gamma_k,$$

which is equivalent to the stated result, since $\gamma^k = -\gamma_k$. □


The first surprising thing about this result is that the components of the
velocity do not commute with each other and so cannot be simultaneously
measured. Furthermore, we have

$$(\gamma_0\gamma^k)^2 = -\gamma_0^2(\gamma^k)^2 = 1, \tag{17.38}$$

so that the only eigenvalues of a velocity component are $\pm c$. Physically this
can be understood as a result of the fact that velocity is calculated from 
successive precise position measurements, which cause a large uncertainty 
in the momentum. Relativistically a large spatial momentum implies a 
velocity close to that of light. In practice one could think of the particle 
continually changing direction to give a much lower net velocity. One can 
also calculate the motion of the particle (Exercise 17.7). This picture of 
the motion is sometimes called Zitterbewegung or trembling motion. 
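
The two algebraic facts used above, that each $\gamma_0\gamma^k$ squares to the identity and that different components fail to commute, can be checked with explicit matrices. This is a sketch under the assumption that the $\gamma$ matrices take the block form of Theorem 17.4.3 with $\gamma^k = -\gamma_k$; the function names are illustrative.

```python
def gamma(p):
    """4x4 matrix gamma(p) = [[p0, -sigma.p], [sigma.p, -p0]] for p = (p0, px, py, pz)."""
    p0, px, py, pz = p
    a, b = px - 1j * py, px + 1j * py
    return [
        [p0, 0, -pz, -a],
        [0, p0, -b, pz],
        [pz, a, -p0, 0],
        [b, -pz, 0, -p0],
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def equal(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
GAMMA0 = gamma((1, 0, 0, 0))


def velocity_over_c(k):
    """gamma_0 gamma^k = -gamma_0 gamma_k for k = 1, 2, 3 (velocity in units of c)."""
    e = [0, 0, 0, 0]
    e[k] = 1
    gamma_upper = [[-x for x in row] for row in gamma(e)]
    return matmul(GAMMA0, gamma_upper)


velocities = [velocity_over_c(k) for k in (1, 2, 3)]

# Each component squares to the identity and is traceless, so its
# eigenvalues are +1 and -1 (each twice): the velocity eigenvalues are +c, -c.
squares_to_one = all(equal(matmul(v, v), IDENTITY) for v in velocities)
traceless = all(abs(sum(v[i][i] for i in range(4))) < 1e-12 for v in velocities)

# Different components do not commute.
v1, v2 = velocities[0], velocities[1]
components_commute = equal(matmul(v1, v2), matmul(v2, v1))

print(squares_to_one, traceless, components_commute)
```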


17.8. Uniqueness of the gamma matrices 


We shall now prove our earlier assertion that the $\gamma$ matrices in the chiral
representation are unique up to equivalence.


Theorem 17.8.1. Let $\gamma(p)$ be operators on a finite-dimensional
complex vector space $V$, which depend linearly on $p$ and satisfy
$\gamma(p)^2 = g(p,p)$ for all $p \in M$. Then $V$ is a direct sum of four-dimensional
subspaces in which, for a suitable choice of basis, $\gamma(p)$
takes the matrix form

$$\gamma(p) = \begin{pmatrix} p_0 & -\sigma.\mathbf{p}\\ \sigma.\mathbf{p} & -p_0\end{pmatrix}.$$


Proof. We shall work with the operators $\gamma_\mu = \gamma(e_\mu)$. Using the anticommutation
relations of Proposition 17.4.4, it is easy to check that $\gamma_0$ and
$\gamma_1\gamma_2$ commute with each other. We may therefore find common eigenvectors
for both. Let us write $\Omega_1$ for one of these, so that for suitable complex
numbers, $\alpha$ and $\beta$, we have

$$\gamma_0\Omega_1 = \alpha\Omega_1, \quad\text{and}\quad \gamma_1\gamma_2\Omega_1 = \beta\Omega_1. \tag{17.39}$$

Since $\gamma_0^2 = 1$ we see that $\alpha = \pm 1$, and similarly the identity
$(\gamma_1\gamma_2)^2 = -\gamma_1^2\gamma_2^2 = -1$ means that $\beta = \pm i$. We readily check that

$$\gamma_0\gamma_3\Omega_1 = -\gamma_3\gamma_0\Omega_1 = -\alpha\gamma_3\Omega_1, \tag{17.40}$$

and that

$$\gamma_1\gamma_2\gamma_3\Omega_1 = \gamma_3\gamma_1\gamma_2\Omega_1 = \beta\gamma_3\Omega_1, \tag{17.41}$$
so that $\Omega_3 = \gamma_3\Omega_1$ is also a common eigenvector with eigenvalues $-\alpha$ and
$\beta$. Similarly, $\Omega_2 = \gamma_3\gamma_1\Omega_1$ and $\Omega_4 = \gamma_1\Omega_1$ are common eigenvectors with
eigenvalues $\alpha$ and $-\beta$, and $-\alpha$ and $-\beta$, respectively. Since all four possible
signs occur, we may permute the four vectors to ensure that $\alpha = 1$ and
$\beta = -i$. On the subspace with basis $\Omega_1$, $\Omega_2$, $\Omega_3$, and $\Omega_4$ the operators $\gamma_0$
and $\gamma_1\gamma_2$ have the diagonal block matrix form


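$$\gamma_0 = \begin{pmatrix}1&0\\0&-1\end{pmatrix}, \qquad \gamma_1\gamma_2 = \begin{pmatrix}-i\sigma_3&0\\0&-i\sigma_3\end{pmatrix}. \tag{17.42}$$

Moreover, the basis vectors have been defined in such a way that

$$\gamma_3 = \begin{pmatrix}0&-\sigma_3\\ \sigma_3&0\end{pmatrix}, \qquad \gamma_1 = \begin{pmatrix}0&-\sigma_1\\ \sigma_1&0\end{pmatrix}. \tag{17.43}$$

From these we may check that $\gamma_2 = -\gamma_1(\gamma_1\gamma_2)$ also has the desired form.
The four basis vectors span an invariant subspace. If this is the whole
space then we are finished. Otherwise we pick another linearly independent
eigenvector and repeat the process. □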


17.9. Lorentz covariance 


The $\gamma$ matrices that appear in the Dirac equation are unique up to equivalence,
so that all observers should agree on the form of the Dirac equation.
Indeed, if the coordinate systems of two observers are related by a Lorentz
transformation $\Lambda$ then, since that preserves $g$, we have

$$\gamma(\Lambda p)^2 = g(\Lambda p, \Lambda p) = g(p,p) \tag{17.44}$$

so that $p \mapsto \gamma(\Lambda p)$ provides another possible choice of matrices, and by
uniqueness this must be equivalent to the previous choice so that there
exist operators, $\Gamma(\Lambda)$, such that

$$\gamma(\Lambda p) = \Gamma(\Lambda)\gamma(p)\Gamma(\Lambda)^{-1}. \tag{17.45}$$

(We could equally well use $\pm\Gamma(\Lambda)$, so there are actually two such operators
for each $\Lambda$. In what follows we shall assume that one has been chosen
arbitrarily.) We can now see how Dirac spinors transform under Lorentz
transformations.

Theorem 17.9.1. Let

$$(V(\Lambda)\tilde\psi)(p) = \Gamma(\Lambda)\,\tilde\psi(\Lambda^{-1}p),$$

and let $\hat D$ be the Fourier-transformed Dirac operator

$$(\hat D\tilde\psi)(p) = \gamma(p)\,\tilde\psi(p).$$

Then

$$V(\Lambda)\hat DV(\Lambda)^{-1} = \hat D,$$

and if $\tilde\psi$ is a solution of the free Dirac equation, then so is $V(\Lambda)\tilde\psi$.

Proof. First the definitions give us

$$\begin{aligned}(V(\Lambda)\hat DV(\Lambda)^{-1}\tilde\psi)(p) &= \Gamma(\Lambda)\,(\hat DV(\Lambda)^{-1}\tilde\psi)(\Lambda^{-1}p)\\ &= \Gamma(\Lambda)\gamma(\Lambda^{-1}p)\,(V(\Lambda)^{-1}\tilde\psi)(\Lambda^{-1}p)\\ &= \Gamma(\Lambda)\gamma(\Lambda^{-1}p)\Gamma(\Lambda)^{-1}\,\tilde\psi(p)\\ &= \gamma(p)\tilde\psi(p) = (\hat D\tilde\psi)(p).\end{aligned} \tag{17.46}$$


Since this identity can be rewritten in the form $V(\Lambda)\hat D = \hat DV(\Lambda)$, when $\tilde\psi$
satisfies the free Dirac equation we have

$$\hat DV(\Lambda)\tilde\psi = V(\Lambda)\hat D\tilde\psi = V(\Lambda)\,mc\,\tilde\psi = mc\,V(\Lambda)\tilde\psi, \tag{17.47}$$

showing that $V(\Lambda)\tilde\psi$ also satisfies the equation. □


17.10. Explicit Lorentz transforms for spinors


In this section we shall obtain explicit formulae for the operator $\Gamma(\Lambda)$ which
will also enable us to show why the Dirac equation describes a spin $\tfrac12$
particle.


Lemma 17.10.1. For any vectors $p, q \in M$ such that $g(q,q) \neq 0$, we
have

$$\gamma(q)\gamma(p)\gamma(q)^{-1} = \gamma(R_qp),$$

where

$$R_qp = \frac{2g(q,p)}{g(q,q)}\,q - p.$$

The transformation $R_q$ is in the Lorentz group, but not proper.


Proof. Since $\gamma(q)^2 = g(q,q)$ we have $\gamma(q)^{-1} = \gamma(q)/g(q,q)$. On multiplying
the anticommutation relation

$$\gamma(q)\gamma(p) + \gamma(p)\gamma(q) = 2g(q,p) \tag{17.48}$$

on the right by $\gamma(q)^{-1}$ we obtain

$$\gamma(q)\gamma(p)\gamma(q)^{-1} + \gamma(p) = 2g(q,p)\,\gamma(q)/g(q,q), \tag{17.49}$$

from which the result follows by rearrangement, and the linearity of $\gamma$.
One readily checks that $g(q, R_qp) = g(q,p)$ so that the component in the $q$
direction is unchanged, and for similar reasons, the orthogonal components
are reversed. A straightforward calculation shows that $R_q$ is in the Lorentz
group, though it is never proper (since the determinant is $(-1)^3$), and it
need not be orthochronous. □


The transformation $R_q$ leaves the component of $p$ in the direction $q$
unchanged, and reverses the components of $p$ orthogonal to $q$. To illustrate
this, we take the unit vector $q = e_0$ to obtain

$$R_qp = 2g(e_0,p)\,e_0 - p = p^0e_0 - p^je_j. \tag{17.50}$$

If we take $q = (0,\mathbf{q})$ with $\mathbf{q}$ a unit vector, then $g(q,q) = -|\mathbf{q}|^2 = -1$, and

$$R_qp = 2(\mathbf{q}.\mathbf{p})\,q - p = (-p_0,\ 2(\mathbf{p}.\mathbf{q})\mathbf{q} - \mathbf{p}). \tag{17.51}$$

The spatial component along $\mathbf{q}$ is unchanged, but orthogonal components
are reversed, which is precisely the effect on the spatial part of a rotation
through $\pi$ about the axis $\mathbf{q}$; the time component changes sign, so this $R_q$
is not orthochronous.
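
Lemma 17.10.1 and the two examples can be checked numerically. The sketch below assumes the block form $\gamma(p) = \begin{pmatrix} p_0 & -\sigma.\mathbf{p}\\ \sigma.\mathbf{p} & -p_0\end{pmatrix}$ and the signature $(+,-,-,-)$; the sample vectors and helper names are illustrative.

```python
def gamma(p):
    """4x4 matrix gamma(p) = [[p0, -sigma.p], [sigma.p, -p0]] for p = (p0, px, py, pz)."""
    p0, px, py, pz = p
    a, b = px - 1j * py, px + 1j * py
    return [
        [p0, 0, -pz, -a],
        [0, p0, -b, pz],
        [pz, a, -p0, 0],
        [b, -pz, 0, -p0],
    ]


def g(p, q):
    """Minkowski inner product with signature (+, -, -, -)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def reflect(q, p):
    """R_q p = 2 g(q, p) / g(q, q) * q - p, defined when g(q, q) != 0."""
    scale = 2 * g(q, p) / g(q, q)
    return tuple(scale * qi - pi for qi, pi in zip(q, p))


def conjugate_by_gamma(q, p):
    """gamma(q) gamma(p) gamma(q)^{-1}, using gamma(q)^{-1} = gamma(q) / g(q, q)."""
    inverse = [[x / g(q, q) for x in row] for row in gamma(q)]
    return matmul(matmul(gamma(q), gamma(p)), inverse)


def max_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


p = (2.0, 0.3, -1.1, 0.7)
q_time = (1.0, 0.0, 0.0, 0.0)      # q = e_0
q_space = (0.0, 0.0, 0.0, 1.0)     # q = (0, q) with q a unit spatial vector
q_general = (0.5, 1.0, 0.2, -0.3)  # any vector with g(q, q) != 0

for q in (q_time, q_space, q_general):
    error = max_difference(conjugate_by_gamma(q, p), gamma(reflect(q, p)))
    print(q, reflect(q, p), error)
```

For the spatial choice the time component of the reflected vector changes sign, in line with the remark in the proof that $R_q$ need not be orthochronous.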


Lemma 17.10.2. Let $\Lambda$ be a proper Lorentz transformation that
fixes two linearly independent vectors $v_1$ and $v_2$, and take any non-null
vector $u$ orthogonal to $v_1$ and $v_2$. If $\Lambda u = -u$ take any $w$ orthogonal
to all three vectors $u$, $v_1$, and $v_2$, and otherwise set $w = (1 + \Lambda)u$.
Then $\Lambda = R_wR_u$.


Proof. Since $\Lambda$ fixes $v_j$ and preserves $g$ we have

$$g(\Lambda u, v_j) = g(\Lambda u, \Lambda v_j) = g(u, v_j) = 0, \tag{17.52}$$

showing that $\Lambda u$, and also $w = u + \Lambda u$, are in the plane orthogonal to $v_1$
and $v_2$. Let us first deal with the case when $\Lambda u \neq -u$, so that $w = u + \Lambda u$,
and

$$\begin{aligned} g(w,w) &= g(u,u) + 2g(u,\Lambda u) + g(\Lambda u,\Lambda u)\\ &= g(u,u) + 2g(u,\Lambda u) + g(u,u)\\ &= 2[g(u,u) + g(u,\Lambda u)] = 2g(u,w).\end{aligned} \tag{17.53}$$

This means that

$$R_wu = \frac{2g(w,u)}{g(w,w)}\,w - u = w - u = \Lambda u. \tag{17.54}$$

Since it is also obvious that $R_u$ fixes $u$, we deduce that $\Lambda u = R_wR_uu$.
Each of the two reflections $R_u$ and $R_w$ reverses the direction of orthogonal
vectors, so their product must fix both $v_1$ and $v_2$. It therefore has the same
action as $\Lambda$. We have already observed that the reflections $R_u$ and $R_w$
each have determinant $-1$, so their product is, like $\Lambda$, proper. It is easy
to see that any proper Lorentz transformation that fixes three independent
vectors must be the identity, and applying this to $\Lambda^{-1}R_wR_u$, we see that
$R_wR_u = \Lambda$. The special case when $\Lambda u = -u$ is also easily checked. □


Theorem 17.10.3. Every proper orthochronous Lorentz transformation
is the product of an even number of elements $R_q$ and the
product of the $g(q,q)$ over all $q$ is 1.


Proof. It is shown in most books on special relativity that every proper
orthochronous Lorentz transformation can be expressed as the product of
two rotations and a standard Lorentz transformation (one that fixes two
spatial axes). Any rotation fixes the time axis and a spatial axis, and so
falls into the class of proper Lorentz transformations considered above, and
so, by definition, do standard Lorentz transformations. □


Corollary 17.10.4. Suppose that the proper Lorentz transformation
$\Lambda = R_{u_1}R_{u_2}\cdots R_{u_k}$. Then $\Gamma(\Lambda) = \gamma(u_1)\gamma(u_2)\ldots\gamma(u_k)$ satisfies

$$\gamma(\Lambda p) = \Gamma(\Lambda)\gamma(p)\Gamma(\Lambda)^{-1}$$

for all $p \in M$.


Proof. We know that we may write $\Lambda = R_{u_1}R_{u_2}\ldots R_{u_k}$, so that, applying
Lemma 17.10.1 repeatedly,

$$\gamma(u_1)\gamma(u_2)\ldots\gamma(u_k)\,\gamma(p)\,\gamma(u_k)^{-1}\ldots\gamma(u_1)^{-1} = \gamma(R_{u_1}R_{u_2}\ldots R_{u_k}p) = \gamma(\Lambda p), \tag{17.55}$$

as we asserted. □


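One of the advantages of this method is that it not only confirms that
$\Gamma(\Lambda)$ of this form exists but tells us how to find it. Consider, for example, a
rotation $R$ through $\theta$ about an axis $\mathbf{n}$. Choosing a unit vector $\mathbf{u}$ perpendicular
to $e_0$ and $\mathbf{n}$, and noting that

$$R\mathbf{u} = \cos\theta\,\mathbf{u} + \sin\theta\,\mathbf{n}\times\mathbf{u}, \tag{17.56}$$

we see that

$$\begin{aligned}(1+R)\mathbf{u} &= (1+\cos\theta)\,\mathbf{u} + \sin\theta\,\mathbf{n}\times\mathbf{u}\\ &= 2\cos(\tfrac12\theta)\,[\cos(\tfrac12\theta)\,\mathbf{u} + \sin(\tfrac12\theta)\,\mathbf{n}\times\mathbf{u}].\end{aligned} \tag{17.57}$$

The reflection $R_w$ is unaffected by the normalization of $w$, and it is more
convenient to define $\mathbf{w}$ as the unit vector $\cos(\tfrac12\theta)\,\mathbf{u} + \sin(\tfrac12\theta)\,\mathbf{n}\times\mathbf{u}$. We now
note that

$$\gamma(w)\gamma(u) = \begin{pmatrix}0&-\sigma.\mathbf{w}\\ \sigma.\mathbf{w}&0\end{pmatrix}\begin{pmatrix}0&-\sigma.\mathbf{u}\\ \sigma.\mathbf{u}&0\end{pmatrix} = \begin{pmatrix}-(\sigma.\mathbf{w})(\sigma.\mathbf{u})&0\\ 0&-(\sigma.\mathbf{w})(\sigma.\mathbf{u})\end{pmatrix}. \tag{17.58}$$

Using Theorem 8.8.1 we have

$$(\sigma.\mathbf{u})(\sigma.\mathbf{w}) = \mathbf{u}.\mathbf{w} + i\sigma.\mathbf{u}\times\mathbf{w} = \cos\tfrac12\theta + i\sin\tfrac12\theta\ \sigma.\mathbf{n} = \exp(+\tfrac12 i\theta\,\sigma.\mathbf{n}), \tag{17.59}$$

and $(\sigma.\mathbf{w})(\sigma.\mathbf{u})$ is the inverse of this, so that

$$\Gamma(R) = \gamma(w)\gamma(u) = -\begin{pmatrix}\exp(-\tfrac12 i\theta\,\sigma.\mathbf{n})&0\\ 0&\exp(-\tfrac12 i\theta\,\sigma.\mathbf{n})\end{pmatrix}. \tag{17.60}$$

This shows immediately that the two-component spinors $\psi_+$ and $\psi_-$ transform
as spinors under the rotation group.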
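
The construction can be tried out for a rotation about the third axis. This is a sketch under stated assumptions: the block form $\gamma(p) = \begin{pmatrix} p_0 & -\sigma.\mathbf{p}\\ \sigma.\mathbf{p} & -p_0\end{pmatrix}$, the choice $\mathbf{n} = (0,0,1)$, $\mathbf{u} = (1,0,0)$, and the product taken in the order $\gamma(w)\gamma(u)$ suggested by $\Lambda = R_wR_u$; names and sample values are illustrative.

```python
import cmath
import math


def gamma(p):
    """4x4 matrix gamma(p) = [[p0, -sigma.p], [sigma.p, -p0]] for p = (p0, px, py, pz)."""
    p0, px, py, pz = p
    a, b = px - 1j * py, px + 1j * py
    return [
        [p0, 0, -pz, -a],
        [0, p0, -b, pz],
        [pz, a, -p0, 0],
        [b, -pz, 0, -p0],
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def max_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


def rotate_z(p, theta):
    """Rotation of the spatial part of p through theta about the third axis."""
    p0, px, py, pz = p
    c, s = math.cos(theta), math.sin(theta)
    return (p0, c * px - s * py, s * px + c * py, pz)


theta = 0.8
u = (0.0, 1.0, 0.0, 0.0)
# w = cos(theta/2) u + sin(theta/2) n x u, with n x u = (0, 1, 0) for n = (0, 0, 1).
w = (0.0, math.cos(theta / 2), math.sin(theta / 2), 0.0)

big_gamma = matmul(gamma(w), gamma(u))
big_gamma_inverse = matmul(gamma(u), gamma(w))  # since gamma(u)^2 = gamma(w)^2 = -1

# Expected form: -diag(e^{-i theta/2}, e^{+i theta/2}, e^{-i theta/2}, e^{+i theta/2}),
# that is, minus two copies of exp(-i theta sigma_3 / 2).
phases = [cmath.exp(-0.5j * theta), cmath.exp(0.5j * theta)] * 2
expected = [[-phases[i] if i == j else 0 for j in range(4)] for i in range(4)]
form_error = max_difference(big_gamma, expected)

# Conjugation by big_gamma implements the rotation on gamma(p).
p = (2.0, 0.3, -1.1, 0.7)
conjugated = matmul(matmul(big_gamma, gamma(p)), big_gamma_inverse)
rotation_error = max_difference(conjugated, gamma(rotate_z(p, theta)))

print(form_error, rotation_error)
```

Replacing `theta` by `theta + 2 * math.pi` changes the sign of `big_gamma` but not the rotation it implements, which is the two-valuedness $\pm\Gamma(\Lambda)$ noted in Section 17.9.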


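Corollary 17.10.5. The infinitesimal generator of rotations about
the axis $\mathbf{n}$ is given by

$$\mathbf{n}.\mathbf{L} + \tfrac12\hbar\begin{pmatrix}\mathbf{n}.\sigma&0\\ 0&\mathbf{n}.\sigma\end{pmatrix},$$

where $\mathbf{L}$ is the standard orbital angular momentum operator.


Proof. The infinitesimal generator is obtained by differentiating the action
of rotations $\Lambda_t$ about $\mathbf{n}$ through an angle $t$,

$$i\hbar\frac{d}{dt}\,(V(\Lambda_t)\tilde\psi)(p) = i\hbar\frac{d}{dt}\,\Gamma(\Lambda_t)\,\tilde\psi(\Lambda_t^{-1}p). \tag{17.61}$$

For rotations we have

$$\Gamma(\Lambda_t) = \begin{pmatrix}\exp(-\tfrac12 it\,\sigma.\mathbf{n})&0\\ 0&\exp(-\tfrac12 it\,\sigma.\mathbf{n})\end{pmatrix}, \tag{17.62}$$

so that, at $t = 0$,

$$i\hbar\frac{d\Gamma(\Lambda_t)}{dt} = \begin{pmatrix}\tfrac12\hbar(\sigma.\mathbf{n})&0\\ 0&\tfrac12\hbar(\sigma.\mathbf{n})\end{pmatrix}. \tag{17.63}$$

The derivative of $\tilde\psi$ just gives the momentum space version of the calculation
that yielded the orbital angular momentum, $\mathbf{n}.\mathbf{L}$, so putting the pieces
together we obtain the desired result. □


Expressing the free Dirac Hamiltonian as

$$H_D = c[P^0 + \gamma_0(mc - D)] \tag{17.64}$$

and using the fact that rotations leave $P^0$ and $\gamma_0$ unchanged and commute
with $D$ we obtain a second proof of the earlier result on the conservation
of $L_k + \tfrac12\hbar\hat\sigma_k$.


17.11. Historical remarks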

When Dirac first realized that his equation also predicted a positively 
charged version of the electron, it was natural to assume that it must be 
the proton, which was the only known positively charged particle at that 
time. Weyl gave strong grounds for insisting that the new particle must 
have the same mass as the electron, and so could not be the much heavier 
proton. In fact the proton is now regarded as a composite particle built 
out of quarks, for which the Dirac equation is not appropriate. 

The equations that we have discussed are by no means the only wave 
equations for relativistic particles. There are others covering higher spin 
particles. There is also a totally different way to approach relativistic 
quantum theory: if one requires a Lorentz-invariant theory then one could 
simply investigate the irreducible representations of the Lorentz group. In 
fact, one could take the larger group, called the Poincaré group, which 
includes the translations in $M$ as well (that is, the maps that take $m \in M$
to $m + a$ for $a \in M$). The irreducible representations of the Poincaré group
were classified by Wigner in 1939. One family of these is characterized 
by a positive real number m and a half integer s, interpreted as mass 
and spin, respectively. Another family has zero mass and a half-integer 
spin-like parameter called helicity. When the helicity is 1, the particle can 
be interpreted as a photon and the two helicity states as its polarization. 
Other representations could describe particles travelling faster than light, 
or zero-mass particles with a continuous spin parameter, but there is no 
experimental evidence for any of these. 


Exercises 


17.1° Show that the Klein-Gordon equation


has normalizable separable solutions of the form $\psi(t,\mathbf{r}) = T(t)R(r)$
provided that $T(t)$ has the form $\exp(-iEt/\hbar)$ with the energy $E < mc^2$.

17.2 The Dirac equation for a particle of rest mass $m$ is written in the
form

$$i\hbar\gamma^\mu\partial_\mu\psi = mc\psi,$$

where

$$\gamma^0 = \begin{pmatrix}1&0\\0&-1\end{pmatrix}, \qquad \gamma^j = \begin{pmatrix}0&\sigma_j\\ -\sigma_j&0\end{pmatrix},$$

and $\sigma_1$, $\sigma_2$, $\sigma_3$ are the usual Pauli spin matrices and $\partial_\mu = \partial/\partial x^\mu$,
$\mu = 0, 1, 2, 3$. Show that the Dirac equation for a particle of rest
mass $m$ has solutions of the form

$$\psi(ct,\mathbf{r}) = e^{-iEt/\hbar}\begin{pmatrix} f(r)\,v\\ g(r)\,\sigma.\mathbf{r}\,v\end{pmatrix},$$

where $f$ and $g$ are real-valued functions of $r$ and $v \in \mathbb{C}^2$ is a fixed
non-zero column vector, provided that


Deduce that for solutions to exist $f$ must satisfy the differential equation

$$\frac{d^2}{dr^2}(rf) = \left(\frac{m^2c^4 - E^2}{c^2\hbar^2}\right)(rf).$$

Hence or otherwise show that there are no normalizable solutions of
this form unless $|E| < mc^2$, and find the solutions in this case.

17.3 A particle of rest mass $m$ satisfies the free particle Dirac equation

$$(\gamma.P - mc)\psi = 0,$$

where

$$\gamma^0 = \begin{pmatrix}1&0\\0&-1\end{pmatrix}, \qquad \gamma^j = \begin{pmatrix}0&\sigma_j\\ -\sigma_j&0\end{pmatrix},$$

and $\sigma_1$, $\sigma_2$, $\sigma_3$ are the usual Pauli spin matrices. Show that

$$\mathbf{J} = \mathbf{X}\times\mathbf{P} + \tfrac14 i\hbar\,\gamma\times\gamma$$

commutes with the relativistic Hamiltonian, where $\gamma = (\gamma^1, \gamma^2, \gamma^3)$.
Give a physical interpretation of $\mathbf{J}$ and the fact that it commutes
with the Hamiltonian.

Show further that $\mathbf{J}.\mathbf{P}$ commutes with the Hamiltonian, and find
the eigenvalues of $\mathbf{J}.\mathbf{P}$ when the particle is in a mutual eigenstate
of energy with eigenvalue $E$ and three-momentum with eigenvalue
$(0, 0, p)$.
[You may assume that $\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}$ and that $\sigma\times\sigma = 2i\sigma$
where $\sigma = (\sigma_1, \sigma_2, \sigma_3)$.]

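17.4° The Dirac Hamiltonian for a free electron is given by

$$H = c(\alpha.\mathbf{p}) + \beta mc^2,$$

where

$$\alpha = \begin{pmatrix}0&\sigma\\ \sigma&0\end{pmatrix}, \qquad \beta = \begin{pmatrix}1&0\\0&-1\end{pmatrix},$$

and $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ with components the Pauli spin matrices, and
1 the unit 2 × 2 matrix.
(i) Show that $\mathbf{J} = \mathbf{L} + \tfrac12\hbar\hat\sigma$, where $\mathbf{L} = \mathbf{X}\times\mathbf{P}$ and

$$\hat\sigma = \begin{pmatrix}\sigma&0\\ 0&\sigma\end{pmatrix},$$

commutes with the Hamiltonian, and comment briefly on this result.
(ii) If $\hbar K = (\hat\sigma.\mathbf{L} + \hbar)$, prove that $K^2$ is a constant of the motion.
(iii) Show that the Hamiltonian may be written

$$H = c\alpha_rp_r + i\hbar c\,|\mathbf{X}|^{-1}\alpha_rK + \beta mc^2,$$

where $p_r = |\mathbf{X}|^{-1}(\mathbf{X}.\mathbf{P} - i\hbar)$, and $\alpha_r = |\mathbf{X}|^{-1}(\alpha.\mathbf{X})$.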


17.5° The matrices $\sigma_\pm(p)$ are defined in terms of the four-vector $p = (p_0, \mathbf{p})$
and the Pauli spin matrices $\sigma_1$, $\sigma_2$, and $\sigma_3$ by

$$\sigma_\pm(p) = p_01 \pm \sigma.\mathbf{p}.$$

Show that $\sigma_\pm(p)$ is a self-adjoint matrix and that every self-adjoint
2 × 2 matrix is of this form for some four-vector $p$. Show also that

$$\det(\sigma_\pm(p)) = g(p,p) = \sigma_+(p)\sigma_-(p).$$

Hence or otherwise show that if $A$ is a 2 × 2 matrix with unit determinant
then

$$A\sigma_+(p)A^* = \sigma_+(\Lambda p)$$

for some Lorentz transformation $\Lambda$, and that

$$A^*\sigma_-(\Lambda p)A = \sigma_-(p).$$

The 4 × 4 matrices $\gamma(p)$ and $S(\Lambda)$ are defined by


=(4% %P) sa=(F an): 
17.6°

17.7

Show that if $\phi(p)$ satisfies the Fourier-transformed Dirac equation

$$(D\phi)(p) = \gamma(p)\phi(p) = mc\,\phi(p)$$

then so does $S(\Lambda)\phi(\Lambda^{-1}p)$.
Show that


0 1 
o@= (2, 4) 6-ror) 
also satisfies the Dirac equation and deduce that the equation has 
both positive and negative energy solutions. 


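An electron of mass $m$ satisfies the Dirac equation

$$i\hbar\frac{\partial\psi}{\partial t} = (c\,\alpha.p + \beta mc^2)\psi$$

where


Setting $k' = k + \omega \times k$ show that to first order in $\omega$,
$$k'.x' = k.x,$$
and also that

$$(1 - \tfrac{1}{2}i\omega.\Sigma)\,\alpha.k = \alpha.k'\,(1 - \tfrac{1}{2}i\omega.\Sigma) \quad\text{where}\quad \Sigma = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma \end{pmatrix}.$$

Deduce that if $\psi'(x') = (1 - \frac{1}{2}i\omega.\Sigma)\psi(x)$ then, to first order in $\omega$,
$\psi'$ also satisfies the Dirac equation.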


Show, using Theorem 17.7.2, that the acceleration, $d^2X^k/dt^2$, of a
Dirac particle in the Heisenberg picture satisfies the equation

$$\frac{d^2X^k}{dt^2} + \frac{2i}{\hbar}\frac{dX^k}{dt}H_D = \frac{2ic^2}{\hbar}P^k.$$

Show that $P^k$ and $H_D$ are constants of the motion, and, using an
integrating factor to integrate the equation of motion, find a
relationship between the velocity and momentum. By integrating the
equation a second time deduce that the position operator, $X^k(t)$, can
be written in the form

$$X^k + c^2P^kH_D^{-1}t + \frac{i\hbar}{2}\left(V^k - c^2P^kH_D^{-1}\right)H_D^{-1}\left(e^{-2iH_Dt/\hbar} - 1\right),$$

where $X^k$ and $V^k$ are the initial values of the position and velocity,
respectively.


18 Dirac particles in electromagnetic fields 


I ... saw Dirac. He has now got a completely new system of equations for
the electron which does the spin right in all cases and seems to be "the
thing". His equations are first order, not second, differential equations! He
told me something about them but I have not even succeeded in verifying
that they are right for the hydrogen atom.

CHARLES GALTON DARWIN, letter to Bohr, 26 December 1927 


18.1. Interacting Dirac particles 


So far we have considered only free particles, moving relativistically
without any external forces. However, since there are reasons to expect that
electrons can be described by the Dirac equation, we wish to know how
they are affected by an electromagnetic field. For this we use the fact,
known from classical Hamiltonian theory (Exercise 18.1), that the
equations of motion of a particle with charge $e$ in an electromagnetic field
given by potentials $\phi$ and $A$ can be obtained by replacing the energy $E$
by $E - e\phi$ and the momentum $\mathbf{p}$ by $\mathbf{p} - eA$. (Some books use $-e$, giving
different signs throughout.) Relativistically this amounts to replacing the
four-momentum $p$ by $p - (e/c)\Phi$, where $\Phi = (\phi, cA)$ is the electromagnetic
four-potential of Definition 17.1.5. Applying the same substitution to the
Dirac theory we are led to the following equation:


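Definition 18.1.1. The Dirac equation for a particle with charge $e$
in an electromagnetic field with four-potential $\Phi$ is

$$\gamma\left(P - \frac{e}{c}\Phi\right)\psi = mc\,\psi.$$

As before this can be recast in Schrödinger form:

$$i\hbar\frac{\partial\psi}{\partial t} = \gamma^0\left[mc^2 - c\gamma_j\left(P^j - eA^j\right)\right]\psi + e\phi\,\psi. \tag{18.1}$$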


Remark 18.1.1. We noted earlier that the electromagnetic potential is
not unique, and the four-potential $\Phi = (\phi, cA)$ could be replaced by

$$\Phi' = \left(\phi - \frac{\partial\chi}{\partial t},\; c(A + \mathrm{grad}\,\chi)\right) = \Phi + \frac{ic}{\hbar}P\chi, \tag{18.2}$$

where $P$ has components defined by Definition 17.4.2.

Proposition 18.1.1. Let $\psi$ be a solution of the Dirac equation
$\gamma(P - e\Phi/c)\psi = mc\,\psi$. Then $\psi_\chi = \exp(ie\chi/\hbar)\psi$ satisfies

$$\gamma\left(P - \frac{e}{c}\Phi'\right)\psi_\chi = mc\,\psi_\chi,$$

where $\Phi' = \Phi + icP\chi/\hbar$.


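Proof. This follows from a direct calculation, since

$$\gamma\left(P - \frac{e}{c}\Phi'\right)\left(e^{ie\chi/\hbar}\psi\right) = e^{ie\chi/\hbar}\,\gamma\left(P + \frac{ie}{\hbar}P\chi - \frac{e}{c}\Phi'\right)\psi = e^{ie\chi/\hbar}\,\gamma\left(P - \frac{e}{c}\Phi\right)\psi = mc\left(e^{ie\chi/\hbar}\psi\right). \tag{18.3}$$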


The effect of a gauge transformation in the electromagnetic potential 
can therefore be absorbed by changing the argument of the wave function. 
This has no effect on the most important physical quantities, which depend 
only on moduli of components of $\psi$, and nowadays one tends to regard this
gauge freedom to change the phase of the wave function as fundamental. 
The electromagnetic field is then a consequence of this freedom rather than 
vice versa. 

One should not, however, assume that the potential is detectable only
through the associated field. In 1959 Yakir Aharonov and David Bohm
pointed out that the potential can introduce an observable phase factor
even when the field vanishes, and this has subsequently been verified
experimentally. Normally, the condition that curl $A$ vanishes is sufficient to
guarantee that $A = \mathrm{grad}\,\chi$ for some $\chi$, but if some portions of space are
excluded this may fail. (Failures can occur whenever there is a closed curve
that is not the boundary of a surface in the region.) Consider, for example,
an electron moving in the region where $|C \times r| > a$ with a potential
$A = C \times r/|C \times r|^2$, where $C$ is a constant. (Physically this could
correspond to the potential due to a current in the direction $C$, in a wire which
is shielded to keep the electron away. The wire and shielding prevent one
from putting a spanning surface across any closed curve that
circumnavigates the wire.) One may check that curl $A$ vanishes and that, formally,
$A = \mathrm{grad}\,\theta/C$ where $\theta$ is a polar angle in the plane perpendicular to $C$.
However, although the gradient is well defined, $\chi = \theta/C$ and the
corresponding phase shift in wave functions, $\exp(i\theta/C)$, are not, because
changing $\theta$ by $2\pi$ introduces a phase factor $\exp(2\pi i/C)$. This Bohm-Aharonov
phase is observable through interference effects. (This phase can also be
regarded as a special case of the Berry phase of Definition 12.5.2.)
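Following the same procedure of writing

$$\psi = \begin{pmatrix} \psi_+ \\ \psi_- \end{pmatrix} \tag{18.4}$$

and using the explicit block form of the $\gamma$ matrices given in Theorem 17.4.3,
the Dirac equation reduces to the coupled equations

$$\begin{aligned}
i\hbar\frac{\partial\psi_+}{\partial t} - e\phi\,\psi_+ &= mc^2\psi_+ + c\,\sigma.(P - eA)\,\psi_-, \\
i\hbar\frac{\partial\psi_-}{\partial t} - e\phi\,\psi_- &= -mc^2\psi_- + c\,\sigma.(P - eA)\,\psi_+.
\end{aligned} \tag{18.5}$$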


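We can subtract the rest energy from the component $\psi_+$ by a gauge
transformation with $e\chi = mc^2t$, replacing $\psi$ by $\exp(imc^2t/\hbar)\psi$. (We
could achieve the same effect for $\psi_-$ using $-\chi$.) The equations are now
replaced by

$$\begin{aligned}
\left(i\hbar\frac{\partial}{\partial t} - e\phi\right)\psi_+ &= c\,\sigma.(P - eA)\,\psi_-, \\
\left(i\hbar\frac{\partial}{\partial t} + 2mc^2 - e\phi\right)\psi_- &= c\,\sigma.(P - eA)\,\psi_+.
\end{aligned} \tag{18.6}$$

In most everyday situations the kinetic energy and electrostatic potential
are small compared with the rest energy, $mc^2$, so the second equation can
be approximated by

$$2mc^2\psi_- = c\,\sigma.(P - eA)\,\psi_+. \tag{18.7}$$

Solving for $\psi_-$ and substituting into the other equation, we obtain

$$\left(i\hbar\frac{\partial}{\partial t} - e\phi\right)\psi_+ = \frac{1}{2m}\left(\sigma.(P - eA)\right)^2\psi_+. \tag{18.8}$$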


In order to evaluate this we first note that the proof of Theorem 8.8.1
depends only on the properties of spin matrices and nowhere assumes that
the components commute. It must therefore be true that

$$(\sigma.(P - eA))^2 = (P - eA).(P - eA) + i\sigma.(P - eA) \times (P - eA). \tag{18.9}$$
The big difference is that the vector product of two identical vectors with
non-commuting components need not vanish. One has, for example,

$$(P_2 - eA_2)(P_3 - eA_3) - (P_3 - eA_3)(P_2 - eA_2) = [P_2 - eA_2, P_3 - eA_3]. \tag{18.10}$$

The components of $P$ commute amongst themselves, as do those of $A$, but,
since $P_j = -i\hbar\,\partial_j$, the cross terms give

$$-e\left([P_2, A_3] + [A_2, P_3]\right) = ie\hbar\left(\partial_2A_3 - \partial_3A_2\right), \tag{18.11}$$

which is the first component of $ie\hbar\,\mathrm{curl}\,A = ie\hbar B$, where $B = \mathrm{curl}\,A$ is
the magnetic field. Thus $(P - eA) \times (P - eA) = ie\hbar B$, and we have proved
the following result:

Corollary 18.1.2. For $A$ and $P$ as above and $B = \mathrm{curl}\,A$, we have

$$(\sigma.(P - eA))^2 = |P - eA|^2 - e\hbar\,\sigma.B.$$

On substituting this result into the earlier equation for $\psi_+$, and
transferring $e\phi$ to the right-hand side, we obtain the following result:

Corollary 18.1.3. When the kinetic and electromagnetic energies
are small compared with $mc^2$, the two-component wave function $\psi_+$
approximately satisfies the equation

$$i\hbar\frac{\partial\psi_+}{\partial t} = \left(\frac{|P - eA|^2}{2m} - \frac{e\hbar}{2m}\sigma.B + e\phi\right)\psi_+.$$

This low energy approximation to the Dirac equation correctly gives the
coupling $-(e\hbar/2m)\sigma.B$ between the magnetic field, $B$, and the spin, $\frac{1}{2}\hbar\sigma$, of
the electron. Earlier attempts to deal with the problem non-relativistically
had given only half the correct expression. This correct prediction of the
so-called gyromagnetic ratio was another triumph of the Dirac theory.
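
As a numerical sanity check on the product formula behind (18.9), the following Python sketch (not part of the original text; the helper names are purely illustrative) verifies $(\sigma.a)(\sigma.b) = a.b + i\sigma.(a \times b)$ for ordinary numerical vectors. For commuting components $a \times a$ vanishes identically, which is why the magnetic term can only come from the non-commutativity of $P$ and $A$.

```python
# Pauli matrices as nested lists of complex numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sigma_dot(v):
    """The 2x2 matrix sigma.v for a numerical 3-vector v."""
    return [[sum(v[n] * SIGMA[n][i][j] for n in range(3)) for j in range(2)]
            for i in range(2)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def product_formula_residual(a, b):
    """Largest entry of (sigma.a)(sigma.b) - [a.b + i sigma.(a x b)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    vec_part = sigma_dot(cross(a, b))
    d = dot(a, b)
    rhs = [[d * (i == j) + 1j * vec_part[i][j] for j in range(2)]
           for i in range(2)]
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

a = [0.3, -1.2, 2.0]
b = [1.5, 0.4, -0.7]
print("residual of product formula:", product_formula_residual(a, b))
print("a x a for commuting components:", cross(a, a))
```

The residual should be at rounding level, and the cross product of a numerical vector with itself is zero component by component.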
18.2. Conserved currents


We want to define an analogue of the probability density and probability
current which led to a conservation law in the non-relativistic case.
Relativistically we might expect these to combine into a single probability
current four-vector. To find probabilities, densities must be integrated
over three-dimensional spatial slices of $M$, and we shall specify that we are
interested in those of the form $\{x \in M : g(U, x) = k\}$ for constant $k$.


Definition 18.2.1. The probability four-current density, $s$, for the
Dirac equation in the frame defined by the four-velocity $U$ is the
unique four-vector that satisfies

$$g(s, p) = c^{-1}\psi^\dagger\gamma(U)\gamma(p)\psi$$

for all $p \in M$, where $\psi^\dagger$ denotes the conjugate transpose of $\psi$.


If we make the usual choice of $U = (c, 0)$, then this definition ensures
that

$$s^0 = g(s, e_0) = \psi^\dagger\gamma^0\gamma^0\psi = \psi^\dagger\psi, \tag{18.12}$$

which is the sum of the probability densities for the individual components
of $\psi$.


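Theorem 18.2.1. The probability current satisfies the conservation
law

$$\frac{\partial s^\mu}{\partial x^\mu} = 0.$$

Proof. We have

$$i\hbar\frac{\partial s^\mu}{\partial x^\mu} = i\hbar\left[\frac{\partial}{\partial x^0}\left(\psi^\dagger\psi\right) + \frac{\partial}{\partial x^j}\left(\psi^\dagger\gamma^0\gamma^j\psi\right)\right]. \tag{18.13}$$

Each term contributes two pieces from differentiating $\psi$ and $\psi^\dagger$. Those
coming from $\psi$ can be written as

$$i\hbar\,\psi^\dagger\gamma^0\left(\gamma^0\frac{\partial\psi}{\partial x^0} + \gamma^j\frac{\partial\psi}{\partial x^j}\right) = \psi^\dagger\gamma^0\gamma(P)\psi = \psi^\dagger\gamma^0\left(mc + \frac{e}{c}\gamma(\Phi)\right)\psi. \tag{18.14}$$

Those for $\psi^\dagger$ can be evaluated using the following identity, which follows
immediately from the explicit matrix formulae: $(\gamma^0\gamma^j)^\dagger = \gamma^0\gamma^j$.
Remembering that the dagger involves a conjugation, we have

$$i\hbar\left(\frac{\partial\psi^\dagger}{\partial x^0}\psi + \frac{\partial\psi^\dagger}{\partial x^j}\gamma^0\gamma^j\psi\right) = -\left(\gamma^0\gamma(P)\psi\right)^\dagger\psi = -\left[\gamma^0\left(mc + \frac{e}{c}\gamma(\Phi)\right)\psi\right]^\dagger\psi. \tag{18.15}$$

The $\psi$ and $\psi^\dagger$ contributions therefore cancel giving the result. □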


18.3. Charged antiparticles 


In Section 17.5 we discussed the negative energy solutions of the Dirac 
equation. There is an alternative way of dealing with antiparticles that 
simplifies the discussion of their electromagnetic properties. It relies on 
the following result: 


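Lemma 18.3.1. For any complex four-vector $R$

$$\gamma^2\,\overline{\gamma(R)}\,\gamma^2 = \gamma(\overline{R}).$$

Proof. All the $\gamma^j$ matrices except $\gamma^2$ are real, whilst that is purely
imaginary, so that

$$\overline{\gamma^j} = \begin{cases} \gamma^j & \text{for } j \neq 2, \\ -\gamma^j & \text{for } j = 2. \end{cases} \tag{18.16}$$

Now from the anticommutation relations on the $\gamma^j$ we also have

$$\gamma^2\gamma^j\gamma^2 = \begin{cases} \gamma^j & \text{for } j \neq 2, \\ -\gamma^j & \text{for } j = 2, \end{cases} \tag{18.17}$$

so that $\gamma^2\,\overline{\gamma^j}\,\gamma^2 = \gamma^j$, from which the result follows. □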
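
The lemma can be checked numerically. The Python sketch below is an illustration only: it assumes the block representation $\gamma^0 = \mathrm{diag}(1, 1, -1, -1)$, $\gamma^j = \begin{pmatrix} 0 & \sigma_j \\ -\sigma_j & 0 \end{pmatrix}$ and the convention $\gamma(R) = R_0\gamma^0 - \sum_j R_j\gamma^j$; both are assumptions about the conventions of Chapter 17, and the function names are hypothetical. Since the identity is linear in the $\gamma^\mu$, the overall sign convention for $\gamma(R)$ does not affect the outcome.

```python
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def gamma_space(s):
    """4x4 block matrix [[0, s], [-s, 0]] built from a 2x2 block s (assumed representation)."""
    g = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = complex(s[i][j])
            g[i + 2][j] = -complex(s[i][j])
    return g

GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA = [GAMMA0] + [gamma_space(s) for s in PAULI]   # gamma^0 .. gamma^3

def matmul4(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def conj4(a):
    """Entrywise complex conjugate (no transpose)."""
    return [[complex(x).conjugate() for x in row] for row in a]

def gamma_of(R):
    """gamma(R) = R_0 gamma^0 - sum_j R_j gamma^j (assumed sign convention)."""
    signs = [1, -1, -1, -1]
    return [[sum(signs[m] * R[m] * GAMMA[m][i][j] for m in range(4))
             for j in range(4)] for i in range(4)]

def lemma_residual(R):
    """Largest entry of gamma^2 conj(gamma(R)) gamma^2 - gamma(conj R)."""
    g2 = GAMMA[2]
    lhs = matmul4(matmul4(g2, conj4(gamma_of(R))), g2)
    rhs = gamma_of([complex(r).conjugate() for r in R])
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))

R = [1 + 2j, 0.5 - 1j, -0.3 + 0.7j, 2 - 0.2j]
print("residual:", lemma_residual(R))
```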


Definition 18.3.1. The operator $K$ that sends a Dirac wave function
$\psi$ to $\psi_c = \gamma^2\overline{\psi}$ is called the charge conjugation operator and $\psi_c$ is
called the charge-conjugated wave function.

The effect of conjugation is to reverse the sign of the energy (since, as
for time reversal, Section 9.11, $\exp(-iEt)$ is changed to $\exp(iEt)$). The
operator $\gamma^2$ interchanges the two pairs of components of $\psi$.


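Theorem 18.3.2. The charge conjugation operator is conjugate linear,
that is $K(a\phi + b\psi) = \overline{a}\phi_c + \overline{b}\psi_c$, and satisfies

$$\phi_c^\dagger\psi_c = \psi^\dagger\phi.$$

Proof. The conjugate linearity is easily checked, and

$$\phi_c^\dagger\psi_c = \left(\gamma^2\overline{\phi}\right)^\dagger\gamma^2\overline{\psi} = \overline{\phi}^\dagger\gamma^{2\dagger}\gamma^2\overline{\psi} = \overline{\phi}^\dagger\overline{\psi} = \psi^\dagger\phi. \tag{18.18}$$

This means that probability densities are unchanged.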


Theorem 18.3.3. Let $\psi$ satisfy the Dirac equation $\gamma(P - e\Phi/c)\psi = mc\,\psi$
for mass $m$ and charge $e$. Then $\psi_c$ satisfies the Dirac equation

$$\gamma(P + e\Phi/c)\psi_c = mc\,\psi_c$$

for mass $m$ and charge $-e$.

Proof. Conjugating the Dirac equation for $\psi$ with the help of Lemma
18.3.1, we have

$$\gamma^2\,\gamma\left(\overline{P - e\Phi/c}\right)\gamma^2\,\overline{\psi} = mc\,\overline{\psi}. \tag{18.19}$$

Now $e\Phi$ is real but, thanks to its factor of $i\hbar$, $P$ is imaginary, so multiplying
by $\gamma^2$ we obtain

$$\gamma\left(P + \frac{e}{c}\Phi\right)\psi_c = mc\,\gamma^2\overline{\psi} = mc\,\psi_c. \tag{18.20}$$

We may interpret $\psi_c$ as the wave function appropriate for the
antiparticle; it describes a particle with the same mass but opposite charge.
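18.4. The Dirac equation for central electrostatic forces


The following result shows that, as one might expect, the spin and orbital
angular momentum are either added or subtracted depending on whether
they are aligned or opposed.


Lemma 18.4.1. If the orbital angular momentum is $j\hbar$ then
$|L + \frac{1}{2}\hbar\sigma|^2$ has eigenvalues $(j \pm \frac{1}{2})(j \pm \frac{1}{2} + 1)\hbar^2$, corresponding to
total angular momentum $(j \pm \frac{1}{2})\hbar$, and $\sigma.L$ takes the values $j\hbar$
or $-(j + 1)\hbar$.


Proof. According to the Clebsch-Gordan formula, Example 16.7.2, we
know that when the angular momenta $j$ and $\frac{1}{2}$ are combined we get either
$j + \frac{1}{2}$ or $j - \frac{1}{2}$, so that

$$|L + \tfrac{1}{2}\hbar\sigma|^2 = \left(j \pm \tfrac{1}{2}\right)\left(j \pm \tfrac{1}{2} + 1\right)\hbar^2. \tag{18.20}$$

We therefore have

$$\begin{aligned}
\hbar\,\sigma.L &= |L + \tfrac{1}{2}\hbar\sigma|^2 - |L|^2 - \tfrac{1}{4}\hbar^2|\sigma|^2 \\
&= \left[\left(j \pm \tfrac{1}{2}\right)\left(j \pm \tfrac{1}{2} + 1\right) - j(j + 1) - \tfrac{3}{4}\right]\hbar^2 \\
&= \left[\pm\left(j + \tfrac{1}{2}\right) - \tfrac{1}{2}\right]\hbar^2,
\end{aligned} \tag{18.21}$$

as asserted. □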
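
The arithmetic in (18.21) is easy to confirm with exact rational numbers. The short Python sketch below (an illustration, with a hypothetical function name) evaluates $(j \pm \frac{1}{2})(j \pm \frac{1}{2} + 1) - j(j + 1) - \frac{3}{4}$ and returns the two values of $\sigma.L$ in units of $\hbar$.

```python
from fractions import Fraction

def sigma_dot_L_values(j):
    """Values of sigma.L (in units of hbar) for orbital quantum number j.

    Uses hbar*sigma.L = |L + (hbar/2) sigma|^2 - |L|^2 - (hbar^2/4)|sigma|^2,
    with |sigma|^2 = 3 and total angular momentum j + 1/2 or j - 1/2.
    """
    j = Fraction(j)
    half = Fraction(1, 2)
    values = []
    for total in (j + half, j - half):
        values.append(total * (total + 1) - j * (j + 1) - Fraction(3, 4))
    return values

for j in range(1, 4):
    aligned, opposed = sigma_dot_L_values(j)
    print(f"j = {j}: sigma.L = {aligned} or {opposed} (units of hbar)")
```

For each $j$ the two results are $j$ and $-(j + 1)$, in agreement with the lemma.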


The operator $\sigma.L$ that appears in this result is useful because it is related
closely to the expressions appearing in the Dirac equation and the following
result will be of crucial importance when we consider central force problems.


Lemma 18.4.2. The position, momentum, and angular momentum
operators satisfy the identities

$$\sigma.P = |X|^{-2}(\sigma.X)(X.P + i\sigma.L),$$

$$(\sigma.X)(\sigma.L + \hbar) + (\sigma.L + \hbar)(\sigma.X) = 0.$$

Proof. The formula for products of spin matrices gives us

$$(\sigma.X)(\sigma.P) = X.P + i\sigma.(X \times P) = X.P + i\sigma.L. \tag{18.22}$$

The first identity follows on multiplying this equation on the left by $\sigma.X$
and using $(\sigma.X)^2 = |X|^2$.
The second identity follows on multiplying the commutation relation of
Proposition 8.1.1(i) by $\sigma_j\sigma_k = 2\delta_{jk} - \sigma_k\sigma_j$, and simplifying. □


We shall now show how to solve central force problems when there is no
magnetic field. We shall therefore set $A = 0$ and assume that the state is
stationary so that $i\hbar\,\partial\psi/\partial t = E\psi$. Then the Dirac equation can be written
in the coupled form

$$\begin{aligned}
(E - mc^2 - e\phi)\psi_+ &= c\,\sigma.P\,\psi_-, \\
(E + mc^2 - e\phi)\psi_- &= c\,\sigma.P\,\psi_+.
\end{aligned} \tag{18.23}$$

Using Lemma 18.4.2 we can express this in terms of the angular
momentum. Thus we may rewrite the equations as

$$\begin{aligned}
(E - mc^2 - e\phi)\psi_+ &= c|X|^{-2}(\sigma.X)(X.P + i\sigma.L)\psi_- \\
&= c|X|^{-1}\left(X.P - i(\sigma.L + 2\hbar)\right)|X|^{-1}(\sigma.X)\psi_-, \\
(E + mc^2 - e\phi)\psi_- &= c|X|^{-2}(\sigma.X)(X.P + i\sigma.L)\psi_+.
\end{aligned} \tag{18.24}$$

The second equation can be rewritten as

$$(E + mc^2 - e\phi)(\sigma.X)\psi_- = c(X.P + i\sigma.L)\psi_+. \tag{18.25}$$

This suggests the substitution $\Psi_- = (\sigma.X/|X|)\psi_-$, whilst leaving $\Psi_+ = \psi_+$.
Using radial coordinates, where, as usual, $X.P = -i\hbar r\,d/dr$, the two
equations become

$$\begin{aligned}
(E - mc^2 - e\phi)\Psi_+ &= c\left(-i\hbar\,d/dr - ir^{-1}(\sigma.L + 2\hbar)\right)\Psi_-, \\
(E + mc^2 - e\phi)\Psi_- &= c\left(-i\hbar\,d/dr + ir^{-1}\sigma.L\right)\Psi_+.
\end{aligned} \tag{18.26}$$

We also know from Lemma 18.4.1 that $\sigma.L$ takes the values $j\hbar$ or $-(j + 1)\hbar$,
and we may as well concentrate on the former. By using the Pauli spin
matrices, the equations for $\Psi_\pm$ can be combined into a single equation for
the two-dimensional vector, $\xi$, formed from the $j\hbar$ eigenvector components
of $\Psi_+$ and $\Psi_-$:

$$\begin{aligned}
(E - e\phi - mc^2\sigma_3)\xi &= -ic\hbar\left[(d/dr + r^{-1}) + r^{-1}(j + 1)\sigma_3\right]\sigma_1\xi \\
&= c\hbar\left[r^{-1}(j + 1)\sigma_2 - i(d/dr + r^{-1})\sigma_1\right]\xi.
\end{aligned} \tag{18.27}$$
Theorem 18.4.3. With the above conventions, the Dirac equation
for motion in a central electrostatic force field can be written in the
form

$$(E - e\phi - mc^2\sigma_3)\xi = -ic\hbar\left[\left(\frac{d}{dr} + \frac{1}{r}\right)\sigma_1 + i\,\frac{j + 1}{r}\,\sigma_2\right]\xi.$$


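This equation can be solved by the usual techniques. One can first look
for asymptotic solutions, or, guided by the non-relativistic case,
immediately guess the appropriate substitution, $\xi = \exp(-\kappa r)\eta$. The equation for
$\eta$ reduces to

$$(E - e\phi - mc^2\sigma_3)\eta = -ic\hbar\left[\left(\frac{d}{dr} + \frac{1}{r} - \kappa\right)\sigma_1 + i\,\frac{j + 1}{r}\,\sigma_2\right]\eta, \tag{18.28}$$

or, equivalently, to

$$(E - mc^2\sigma_3 - ic\hbar\kappa\sigma_1)\eta = -ic\hbar\left[\left(\frac{d}{dr} + \frac{1}{r}\right)\sigma_1 + i\,\frac{j + 1}{r}\,\sigma_2 + i\,\frac{e\phi}{c\hbar}\right]\eta. \tag{18.29}$$

It will be useful to introduce the notation

$$R = i(c\hbar)^{-1}(E - mc^2\sigma_3 - ic\hbar\kappa\sigma_1) \tag{18.30}$$

for this matrix, which appears often in the calculations. At this point we
shall also specialize to the case $e\phi = c\hbar\alpha/r$. (The fine structure constant
$\alpha$ is around 1/137 for hydrogen, and smaller than 1 for all stable atomic
nuclei.) With these conventions, our equation becomes

$$R\eta = \left[\left(\frac{d}{dr} + \frac{1}{r}\right)\sigma_1 + \frac{i(j + 1)\sigma_2 + i\alpha}{r}\right]\eta. \tag{18.31}$$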

To obtain asymptotic solutions with $\eta$ almost constant for large $r$ we need
$R\eta = 0$, and this can have non-trivial solutions only if $\det(R)$ vanishes,
that is

$$\det(E - mc^2\sigma_3 - ic\hbar\kappa\sigma_1) = 0, \tag{18.32}$$

which forces $\kappa$ to satisfy the equation

$$c^2\hbar^2\kappa^2 = m^2c^4 - E^2. \tag{18.33}$$

We must pick the positive root to get a normalizable solution.
To find $\eta$ for smaller values of $r$ we try the series solution

$$\eta = \sum_{k \geq 0} r^{k + \delta}a_k, \tag{18.34}$$

where the coefficients $a_k$ are two-dimensional column vectors. When this
is substituted into (18.31), we obtain the equation

$$\sum_k r^{k + \delta}Ra_k = \sum_k\left[(k + \delta + 1)\sigma_1 + i(j + 1)\sigma_2 + i\alpha\right]r^{k + \delta - 1}a_k. \tag{18.35}$$

The indicial equation is

$$0 = \left[(\delta + 1)\sigma_1 + i(j + 1)\sigma_2 + i\alpha\right]a_0, \tag{18.36}$$

which means that, for a non-trivial solution,

$$\det\left[(\delta + 1)\sigma_1 + i(j + 1)\sigma_2 + i\alpha\right] = 0, \tag{18.37}$$

or

$$(\delta + 1)^2 = (j + 1)^2 - \alpha^2. \tag{18.38}$$

For normalizability we must choose the positive root for $\delta + 1$. (Recalling
that $\alpha$ is much smaller than 1, we see that there is no problem in taking
the square root, and in fact $\delta \approx j$.) We can then study the recurrence
relations for the coefficients, which as usual lead to the requirement that
the series should terminate. If $a_n$ is the last non-zero coefficient then, for
consistency, we require that

$$Ra_n = 0, \tag{18.39}$$

and that

$$Ra_{n-1} = \left[(n + \delta + 1)\sigma_1 + i(j + 1)\sigma_2 + i\alpha\right]a_n. \tag{18.40}$$


Now, since $R$ has trace $2iE/c\hbar$, and vanishing determinant, its
characteristic equation is

$$R(R - 2iE/c\hbar) = 0. \tag{18.41}$$

Multiplying (18.40) by $R - 2iE/c\hbar$, the left-hand side vanishes to give

$$0 = (R - 2iE/c\hbar)\left[(n + \delta + 1)\sigma_1 + i(j + 1)\sigma_2 + i\alpha\right]a_n. \tag{18.42}$$

Now, exploiting the properties of the Pauli spin matrices, we have

$$(R - 2iE/c\hbar)\sigma_2 = -\sigma_2R, \tag{18.43}$$

and

$$(R - 2iE/c\hbar)\sigma_1 = -\sigma_1R + 2\kappa. \tag{18.44}$$
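
The three matrix identities (18.41), (18.43) and (18.44) can be checked numerically. The following Python sketch is an illustration only: it assumes units in which $c\hbar = 1$ and $mc^2 = 1$, takes $\kappa$ from (18.33), and uses hypothetical helper names.

```python
import math

S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
ID = [[1, 0], [0, 1]]

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(ca, a, cb, b):
    """The combination ca*a + cb*b of two 2x2 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def size(a):
    """Largest absolute value of an entry."""
    return max(abs(a[i][j]) for i in range(2) for j in range(2))

def make_R(E, mc2=1.0):
    """R = i(E - mc^2 sigma_3 - i kappa sigma_1) with c*hbar = 1 (assumed units)."""
    kappa = math.sqrt(mc2 ** 2 - E ** 2)      # positive root of (18.33)
    R = [[1j * (E * ID[i][j] - mc2 * S3[i][j] - 1j * kappa * S1[i][j])
          for j in range(2)] for i in range(2)]
    return R, kappa

E = 0.6
R, kappa = make_R(E)
shifted = lin(1, R, -2j * E, ID)                          # R - 2iE
res_41 = size(mul(R, shifted))                            # R(R - 2iE) = 0
res_43 = size(lin(1, mul(shifted, S2), 1, mul(S2, R)))    # (R - 2iE)s2 + s2 R = 0
res_44 = size(lin(1, lin(1, mul(shifted, S1), 1, mul(S1, R)),
                  -2 * kappa, ID))                        # (R - 2iE)s1 + s1 R - 2 kappa = 0
print(res_41, res_43, res_44)
```

All three residuals should be at rounding level. The last two identities hold for any $\kappa$, whereas (18.41) relies on the vanishing determinant, that is on (18.33).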
On substituting these relations and using (18.39) we arrive at

$$0 = \left[2\kappa(n + \delta + 1) + 2\alpha E/c\hbar\right]a_n. \tag{18.45}$$

This gives us the quantization condition that

$$\alpha E = -c\hbar\kappa(n + \delta + 1), \tag{18.46}$$

which with the earlier equations for $\kappa$ and $\delta$ determines the energy levels.
On substituting (18.46) into (18.33) we arrive at the following result:


Theorem 18.4.4. The energy levels of the relativistic hydrogen-like
atom are given by

$$E = \frac{mc^2}{\sqrt{1 + \left[\alpha^2/(n + \delta + 1)^2\right]}},$$

where $\delta$ is determined by (18.38).
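
To see the size of the effect, the formula of Theorem 18.4.4 can be evaluated directly. The Python sketch below is illustrative only: energies are in units of $mc^2$, the value $\alpha \approx 1/137.036$ (hydrogen, $Z = 1$) is an assumed input, and the function names are hypothetical. It also evaluates the leading binomial estimate $1 - \alpha^2/2(n + \delta + 1)^2$ for comparison.

```python
import math

def delta_plus_one(j, alpha):
    """Positive root of (delta + 1)^2 = (j + 1)^2 - alpha^2, equation (18.38)."""
    return math.sqrt((j + 1) ** 2 - alpha ** 2)

def dirac_energy(n, j, alpha):
    """E/(mc^2) from Theorem 18.4.4."""
    return 1.0 / math.sqrt(1.0 + alpha ** 2 / (n + delta_plus_one(j, alpha)) ** 2)

def binomial_estimate(n, j, alpha):
    """Leading term of the binomial expansion of E/(mc^2)."""
    return 1.0 - alpha ** 2 / (2.0 * (n + delta_plus_one(j, alpha)) ** 2)

ALPHA = 1 / 137.036   # assumed value for hydrogen (Z = 1)

for n, j in [(0, 0), (1, 0), (0, 1)]:
    exact = dirac_energy(n, j, ALPHA)
    approx = binomial_estimate(n, j, ALPHA)
    print(f"n={n}, j={j}: E/mc^2 = {exact:.12f}, estimate = {approx:.12f}")
```

For $n = 0$, $j = 0$ the formula reduces to $E = mc^2\sqrt{1 - \alpha^2}$. The pairs $(n, j) = (1, 0)$ and $(0, 1)$, which share the same value of $n + j + 1$, give slightly different energies because $\delta$ depends on $j$.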


Since $\alpha < 1$ we may approximate this answer by the first terms in the
binomial expansion to get

$$E \approx mc^2 - \frac{mc^2\alpha^2}{2(n + \delta + 1)^2}. \tag{18.47}$$

For hydrogen-like atoms one has $c\hbar\alpha = Ze^2/4\pi\epsilon_0$, which gives the
approximation

$$E \approx mc^2 - \frac{Ze^2}{4\pi\epsilon_0a}\,\frac{1}{2(n + \delta + 1)^2}, \tag{18.48}$$

where $a$ is the usual Bohr radius.

Unlike its non-relativistic analogue, the relativistic energy does depend,
through $\delta$, on the angular momentum. This was another triumph of the
Dirac theory, since it was known experimentally that the spectral lines were
split into groups according to the angular momentum, and the relativistic
formula correctly gives this fine structure.


18.5. The successes of the Dirac theory 


• It correctly incorporates spin into the relativistic wave equation.
• It gives the correct coupling to the magnetic field.
• It correctly predicted the existence of antiparticles.
• It gives the fine structure in the spectrum of hydrogen-like atoms.
Exercises


18.1° The classical Hamiltonian for a charged particle in an
electromagnetic field is

$$h = \frac{|p - eA|^2}{2m} + e\phi.$$

Show, using Hamilton's classical equations of motion, that

$$m\ddot{x} = e(E + \dot{x} \times B),$$

where $B = \mathrm{curl}\,A$ and $E = -(\partial A/\partial t + \mathrm{grad}\,\phi)$.
(Hint: $\mathrm{grad}\left(\frac{1}{2}|U|^2\right) = U \times (\mathrm{curl}\,U) + U.\nabla U$.)


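18.2° The Dirac equation for a particle of rest mass $m$ and charge $e$ moving
in an electromagnetic field with four-potential $a = (\phi, -cA)$ is

$$\left[\gamma^\mu\left(i\hbar\partial_\mu - \frac{e}{c}a_\mu\right) - mc\right]\psi = 0,$$

where

$$\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \gamma^j = \begin{pmatrix} 0 & \sigma_j \\ -\sigma_j & 0 \end{pmatrix}, \quad j = 1, 2, 3,$$

and $\sigma_1$, $\sigma_2$, $\sigma_3$ are the usual Pauli spin matrices and $\partial_\mu = \partial/\partial x^\mu$,
$\mu = 0, 1, 2, 3$. Show that the equation can be rewritten in the form

$$i\hbar\frac{\partial\psi}{\partial t} = H_D\psi,$$

where $H_D$ has the form

$$H_D = c\,\alpha.(P - eA) + \beta mc^2 + e\phi.$$

Show that, when $a$ vanishes, $H_D$ commutes with $L + \frac{1}{2}\hbar\sigma'$, where $L$
is the orbital angular momentum and

$$\sigma' = \begin{pmatrix} \sigma & 0 \\ 0 & \sigma \end{pmatrix}.$$

Give a brief physical interpretation of this result.
Using the Heisenberg picture calculate the velocity $dX_1/dt$ and
show that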


{ESR ean SAA pe ee ta en or ae 


fe 
18.3° The Dirac equation for a particle of mass $m$ and charge $e$ moving in
the electromagnetic field with four-potential $\Phi$ is

$$[\gamma.(p - e\Phi) - mc]\psi = 0,$$


where p is the four-momentum operator and 


o_/f1 0 _{ 0 « 


The particles are constrained to move in the x3 direction only, in the 
four-potential 


a= {5¥le fal<a 


0 [x3] > a, A=0, 


where V is a positive constant and a > 0. Write down the differential 
equations satisfied by ~ in the regions |x3| < a and |z3| > a. What 
are the connection conditions at 3 = ta? When mc > Po > me— 
V > O let w be given by 


i 
e~ *Pozo/h [ere ( me ) 4 e7tPas/h ( 4 )] 
Por V+me73¥ PoiVemel3U 


when |z3| <a, and by 


e7*Poto/h o—Kas/h ( ue ) 
Po-tme 73 


when 23 > a, and v, v’, and w are constant 2 x 1 matrices. Show 
that w defines a solution in x3 > ~a if 


(Po + V)? — P? = m2c?, Po? + K? = m2? 
and v, v’, and w satisfy the equations 
efPAlliy 4 e-tPa/hy! —. ¢—Ka/hy, 
eiPa/h,, _ --iPajnyy _ ,-K(Po+V + me) 
. P(Po + mc) 
Write down the solution in +3 < —a and deduce that the bound state 


energies are given by 
bans 2Pa\ 2a 
ht)” 1-0?’ 


where a = |K|(Po + V + me)/P(Py + mc) and E = cP. 


ew Ka/hyy 


19* Symmetries of elementary particles 


If, as I have reason to believe, I have disintegrated the nucleus of the
atom, this is of greater significance than the war.


ERNEST RUTHERFORD, apologizing for absence from a meeting 
of the International Anti-submarine Warfare Committee, June 1919. 


19.1. The structure of matter 


It will have become apparent over the last few chapters that the electrons, 
protons, and neutrons, which are the most prominent constituents of mat- 
ter, are only some of the many particles that are now known. There are also 
neutrinos, which appear in some nuclear reactions, Yukawa’s mesons, which 
help bind the nucleus together, and then to every particle a corresponding 
antiparticle. With improved techniques for detecting particles in cosmic 
rays and the increasingly powerful particle accelerators built after the war 
more and more different kinds of subatomic particle were discovered. The
various conservation laws found to hold during collisions between particles 
suggested that there must be some kind of symmetry principle linking dif- 
ferent particles. For example, protons and neutrons belong to a class of 
particles called baryons, and during collisions the total number of baryons 
minus the total number of antibaryons is constant. A similar conservation 
law held for the class of leptons which included the lighter particles such 
as electrons and neutrinos. Scattering experiments at higher energies began
to indicate that the proton has an internal structure suggesting that it
might be composed of still smaller particles. 

The theory that evolved during the early 1960s suggests that baryons 
and mesons are composed of particles called quarks and their antiparticles, 
the antiquarks. There are several different kinds of quark but all have 
spin $\frac{1}{2}$ and are fermions. Although they differ in such properties as their
electric charge the quarks are remarkably similar. Already in the 1930s 
similarities between the proton and the neutron, whose masses (1836 and 
1839 times the electron mass, respectively) are almost identical, and which 
seem to differ in little more than charge, had led Heisenberg to suggest that 
they might simply be different states of the same particle. The apparent 
differences between them would then be no more significant than those 
between electrons with different spins. In the current theories this idea 
is applied to quarks, which are all regarded as being states of a single 
particle. Rather than talking of $Q$ different sorts of quark each with a
two-component wave function (for the two different spin states), it is therefore
more appropriate to talk of a $2Q$-component wave function describing the
state of a quark.

Baryons are made of three fermionic quarks; the baryon states are
described by $\Lambda^3\mathbb{C}^{2Q}$-valued wave functions. (Experiments show that quarks
nestle in close proximity within protons, indicating that fermionic
antisymmetry affects the values of their wave functions rather than their
spatial distribution.) Similarly mesons consist of a quark and an antiquark.
The states of such a system lie in the space $\mathbb{C}^{2Q} \otimes \mathbb{C}^{2Q*} \cong L(\mathbb{C}^{2Q})$ (see
Proposition 16.5.3(iv)), and, in fact, are always in the subspace of traceless
operators.


For simplicity let us start by ignoring spin and concentrating attention 
on the space $\mathbb{C}^Q$ of quarks with a given spin. The unitary transformations
of this space cannot be precise symmetries (that is, the Hamiltonian cannot 
intertwine them) or it would be quite impossible to distinguish the differ- 
ent sorts of quark. It is now believed that the qualities of quarks are of 
two kinds, known as flavour and colour. (As in the modern food industry 
colour and flavour are supposed to be independent of each other.) Colour 
symmetry is exact, and quarks that differ only in their colour cannot be dis- 
tinguished. Flavour symmetry is only approximate, and different flavours 
of quark may have different masses and charges. In both cases, however, 
one is led to investigate the way in which the symmetry or approximate 
symmetry groups act on the states of the system, that is the representation 
theory of the groups of unitary operators. 


19.2. Characters of unitary groups 


Having explained the physical significance of the unitary groups we shall 
now investigate their properties mathematically. 


Definition 19.2.1. The unitary group U(n) is the group of all uni- 


tary n x n matrices U, that is matrices satisfying U*U = 1 = UU*. 


We shall start by recalling a well-known algebraic property of unitary 
matrices. 
Theorem 19.2.1. Every matrix in $U(n)$ is conjugate to a diagonal
matrix. The entries on the diagonal are unique up to changes of order.


Proof. This is a reinterpretation of Theorem A1.3.2 in the first appendix,
according to which each unitary operator $U$ on $\mathbb{C}^n$ admits an orthonormal
basis of eigenvectors, $v_1, v_2, \ldots, v_n$, such that $Uv_j = \alpha_jv_j$. Let $V$ be the
operator that maps the natural basis $\{e_j\}$ of $\mathbb{C}^n$ to the eigenvector basis.
Then $V$ must be unitary and

$$V^{-1}UVe_j = V^{-1}Uv_j = \alpha_jV^{-1}v_j = \alpha_je_j, \tag{19.1}$$

so that $V^{-1}UV$, which is conjugate to $U$ in $U(n)$, has a diagonal matrix.
Since the diagonal entries are the eigenvalues of $U$ they are unique up to
order. By reordering the basis elements any change of order is possible. □


We want to study finite-dimensional representations of a unitary group, 
G, that is homomorphisms of G into groups of unitary operators on inner 
product spaces. We saw in Section 9.7 that such representations can be 
described by their characters and that these are constant on conjugacy 
classes, which leads immediately to the following result. 


Theorem 19.2.2. Each character $\chi$ of $U(n)$ is uniquely determined
by its restriction to the diagonal matrices. It can therefore be
identified with a function on $\mathbb{T}^n$ which is invariant under permutations of
its arguments.


Proof. The character $\chi$ is invariant under conjugation so that, in the
notation of the previous proof,

$$\chi(U) = \chi(V^{-1}UV), \tag{19.2}$$

showing that $\chi$ is determined by its value on diagonal matrices. The
diagonal entries $\alpha_1, \alpha_2, \ldots, \alpha_n$, being eigenvalues of a unitary matrix, have
modulus 1, that is they lie in $\mathbb{T}$. Thus $\chi(V^{-1}UV)$ can be identified with
a function of $(\alpha_1, \alpha_2, \ldots, \alpha_n) = \alpha \in \mathbb{T}^n$. Since the order of the $\alpha_j$ can be
changed by conjugation $\chi$ must be invariant under permutations. □
Definition 19.2.2. We now introduce the elementary symmetric
polynomials $\sigma_k$, $k = 1, 2, \ldots, n$, as the coefficients of the polynomial

$$\prod_{j=1}^{n}(1 + z\alpha_j) = 1 + \sum_{k=1}^{n}z^k\sigma_k.$$

Explicitly we have

$$\sigma_k = \sum_{j_1 < j_2 < \cdots < j_k}\alpha_{j_1}\alpha_{j_2}\cdots\alpha_{j_k}.$$

The first two symmetric polynomials are $\sigma_1 = \sum_j\alpha_j$ and $\sigma_2 = \sum_{j<k}\alpha_j\alpha_k$,
and the last is $\sigma_n = \alpha_1\alpha_2\cdots\alpha_n$.


Theorem 19.2.3. The characters of finite-dimensional representations
of $U(n)$ can be regarded as polynomials in $\sigma_1, \sigma_2, \ldots, \sigma_n$ and
$\sigma_n^{-1}$.


Proof. The subgroup of diagonal matrices in $U(n)$ is abelian and
isomorphic to $\mathbb{T}^n$, and so the restriction of any finite-dimensional
representation of $U(n)$ will decompose into a direct sum of irreducibles. By equation
(9.30) these are one dimensional and $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ acts as $\alpha_1^{k_1}\alpha_2^{k_2}\cdots\alpha_n^{k_n}$
for some integers $k_1, k_2, \ldots, k_n$. The character $\chi(U)$, being a sum of such
monomials, is therefore a polynomial in the $\alpha_j$ and their inverses. If $-k$ is
the most negative integer to appear as the exponent of any of the $\alpha_j$ then
$\chi(U)\sigma_n^k$ can contain only non-negative powers of the $\alpha_j$ and so is a polynomial.
However, it is a classical theorem of algebra that any polynomial in the $\alpha_j$
that is invariant under permutations can be expressed as a polynomial in
the $\sigma_j$. (This can be proved by induction on the number of variables and the
degree of the polynomial.) Thus $\chi(U)\sigma_n^k$ is a polynomial in $\sigma_1, \sigma_2, \ldots, \sigma_n$
as required. □


Remark 19.2.1.

$$\prod_{j=1}^{n}(1 + z\alpha_j) = \det(1 + zV^{-1}UV) = \det[V^{-1}(1 + zU)V] = \det(1 + zU), \tag{19.3}$$

so that the $\sigma_j$ can easily be calculated directly from $U$; indeed they are
closely related to the coefficients of the characteristic polynomial.


We have not so far used the fact that the $\alpha_j$ have modulus 1. This means
that $\sigma_n^{-1} = \overline{\sigma_n}$, and more generally that $\sigma_{n-k} = \sigma_n\overline{\sigma_k}$.
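
The generating polynomial of Definition 19.2.2 gives a direct way of computing the $\sigma_k$, and the relation $\sigma_{n-k} = \sigma_n\overline{\sigma_k}$ can then be tested on sample points of the torus. The Python sketch below is an illustration; the function name and the sample angles are arbitrary choices.

```python
import cmath

def elementary_symmetric(alphas):
    """Coefficients [sigma_0, ..., sigma_n] of prod_j (1 + z*alpha_j), with sigma_0 = 1."""
    coeffs = [1 + 0j]
    for a in alphas:
        # Multiply the current polynomial by (1 + z*a).
        new = coeffs + [0j]
        for k in range(len(coeffs)):
            new[k + 1] += a * coeffs[k]
        coeffs = new
    return coeffs

# Three points of modulus 1, as for the eigenvalues of a matrix in U(3).
alphas = [cmath.exp(1j * t) for t in (0.3, 1.1, -2.0)]
sigma = elementary_symmetric(alphas)
n = len(alphas)

for k in range(n + 1):
    gap = abs(sigma[n - k] - sigma[n] * sigma[k].conjugate())
    print(f"k = {k}: |sigma_(n-k) - sigma_n * conj(sigma_k)| = {gap:.2e}")
```

Each printed gap should be at rounding level; the case $k = n$ is the statement that $|\sigma_n| = 1$.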

Often we are only interested in subgroups of U(n), rather than the whole 
group. 


Definition 19.2.3. The special unitary group is the subgroup of 
unitary matrices whose determinant is 1: 


SU(n) = {U € U(n) : det(U) = 1}. 


This is particularly important, since then

$$\sigma_n = \det(V^{-1}UV) = \det(U) = 1, \tag{19.4}$$

which leads to the following result:


Theorem 19.2.4. The characters of representations of $SU(n)$ can
be identified with elements of $\mathbb{C}[\sigma_1, \sigma_2, \ldots, \sigma_{n-1}]$.


Example 19.2.1. The characters of $U(1)$ are described by elements of
$\mathbb{C}[\sigma_1, \sigma_1^{-1}]$. Since $\mathbb{T} = U(1)$, this accords with Example 9.4.2, where it is
shown that the irreducible representations are just given by integer powers
of $\alpha_1 = \sigma_1$.


Example 19.2.2. The characters of $SU(2)$ are described by elements of
$\mathbb{C}[\sigma_1]$, that is polynomials in $\sigma_1 = \alpha_1 + \alpha_2 = \alpha_1 + \overline{\alpha_1}$.


Example 19.2.3. The characters of $SU(3)$ are described by elements of
$\mathbb{C}[\sigma_1, \sigma_2] = \mathbb{C}[\sigma_1, \overline{\sigma_1}]$.
19.3. Representations of unitary groups


It is useful to preface our discussion of unitary groups with a general
construction.


Definition 19.3.1. Let $U$ be a representation of a group $G$ on a
space $\mathcal{H}$. The contragredient representation is defined on the dual
vector space $\mathcal{H}^*$ by

$$(U^*(x)f)(\psi) = f(U(x^{-1})\psi)$$

for $x$ in $G$, $\psi$ in $\mathcal{H}$, and $f$ in $\mathcal{H}^*$.


The importance of the contragredient representation lies in its applica- 
tion to antiparticles. If a symmetry group G for the particle has a represen- 
tation U on the space of particle states, then the contragredient represen- 
tation U* acts on the states of the antiparticle. When U and U* coincide 
then the particle is its own antiparticle. 

When $\mathcal{H}$ is a finite-dimensional space so that we can write its elements
as column vectors and the dual vectors as row vectors, then using $T$ to
denote the transpose we have

$$(U^*(x)f)^T v = f^T(U(x^{-1})v) = (U(x^{-1})^T f)^T v. \tag{19.5}$$

Then we can identify $U^*(x)$ with

$$U(x^{-1})^T = (U(x)^*)^T = \overline{U(x)}, \tag{19.6}$$

since the adjoint is the complex conjugate of the transpose. A similar
argument using dual bases shows that the character of $U^*$ is the complex
conjugate of the character of $U$. If the character is real, as for the adjoint
representation, then the contragredient representation $U^*$ is equivalent to
$U$.

We shall now describe some of the representations of U(n), which cor- 


respond to the characters already mentioned. We start with one obvious 
example. 


Example 19.3.1. (The natural representation) One can also regard
$U(n)$ as the unitary operators on the space $\mathbb{C}^n$ equipped with the usual
inner product, so there is a natural representation in which the matrix $U$
is mapped to itself regarded as an operator on $\mathbb{C}^n$. The character of this
representation is

$$\operatorname{tr} U = \operatorname{tr}(V^{-1}UV) = \sum_j \alpha_j = \sigma_1. \tag{19.7}$$

This is the representation that describes the states of the quarks themselves.

The tensor powers of this representation can also be formed as in Definition
16.7.1. According to Proposition 16.7.1 the $k$-th tensor power has
character $\sigma_1^k$. This means that all the representations of $SU(2)$ can be
constructed from tensor powers of the natural representation. The contragredient
representation has character $\overline{\sigma_1}$. Example 19.2.3 then tells us that
all the representations of $SU(3)$ can be constructed from tensor products
of the natural representation and its contragredient.

In practice the symmetric and exterior powers (Section 16.6) are more 
useful than general tensor powers. 


Example 19.3.2. (The exterior power $\Lambda^k$) The exterior power is
represented on $\Lambda^k\mathbb{C}^n$, which has basis $e_{j_1} \wedge e_{j_2} \wedge \ldots \wedge e_{j_k}$, with $j_1 < j_2 < \ldots < j_k$.
This is an eigenvector for the diagonal matrix $V^{-1}UV$ with
eigenvalue $\alpha_{j_1}\alpha_{j_2}\cdots\alpha_{j_k}$, so that the character is given by

$$\sum_{j_1 < j_2 < \ldots < j_k} \alpha_{j_1}\alpha_{j_2}\cdots\alpha_{j_k} = \sigma_k. \tag{19.8}$$

The baryons, which are made up of three quarks, are described by the
representation of $U(2Q)$ on $\Lambda^3\mathbb{C}^{2Q}$, with character $\sigma_3$, and the antibaryons
are described by the contragredient representation whose character is $\overline{\sigma_3}$.

It is a corollary of Theorem 19.2.4 that all the representations of $SU(n)$
can be constructed from the various exterior powers of the natural
representation.


Example 19.3.3. (The symmetric power $S^k$) The characters $S_k$ of
the symmetric powers are not so easily expressed, but by taking bases it is
not difficult to show that

$$S_k(\alpha) = \sum_{k_1 + \ldots + k_n = k} \alpha_1^{k_1}\alpha_2^{k_2}\cdots\alpha_n^{k_n}. \tag{19.9}$$

Although the symmetric powers would seem irrelevant to a discussion of
fermions, they do appear as subrepresentations when we take into account
the fact that flavour symmetry is not precise, but we really only need the
cases of $k = 2$ and 3. When $k = 2$ the formula reduces to

$$S_2(\alpha) = \sum_j \alpha_j^2 + \sum_{j<k} \alpha_j\alpha_k = \sigma_1^2 - \sigma_2. \tag{19.10}$$

This confirms the decomposition of $\otimes^2\mathbb{C}^n$ into a symmetric and an
antisymmetric part ($\sigma_1^2 = S_2 + \sigma_2$). Setting $k = 3$ we obtain

$$S_3(\alpha) = \sum_j \alpha_j^3 + \sum_{j \neq k} \alpha_j^2\alpha_k + \sum_{j<k<l} \alpha_j\alpha_k\alpha_l = \sigma_1^3 - 2\sigma_1\sigma_2 + \sigma_3. \tag{19.11}$$

It follows from this identity that $\sigma_1^3 = S_3 + \sigma_3 + 2\Pi$, where we have
introduced $\Pi = \sigma_1\sigma_2 - \sigma_3$. In particular, this shows that the third tensor
power contains more than just its symmetric and antisymmetric part. The
character $\Pi$ is of interest in its own right. We shall not describe an explicit
form of the corresponding representation in general, but in low dimensions
it often coincides with one of those already discussed. For example, in the
case of $SU(2)$ we have $\sigma_3 = 0$ and $\sigma_2 = 1$, so that $\Pi = \sigma_1$. For $SU(3)$, we
have $\sigma_3 = 1$ and $\sigma_2 = \overline{\sigma_1}$, so that $\Pi = |\sigma_1|^2 - 1$. This is the character of
the adjoint representation, which we shall describe in the next example.
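
The identities of this example can be confirmed numerically. The sketch below is illustrative only: the helper names `sigma`, `S`, and `Pi` and the sample eigenvalues are assumptions made for the check, not notation fixed by the text. It evaluates the characters directly from eigenvalues, using the fact that $S_k$ is the sum of all monomials of degree $k$.

```python
import cmath
from itertools import combinations, combinations_with_replacement
from math import prod


def sigma(alphas, k):
    """Character of the k-th exterior power: elementary symmetric polynomial."""
    return sum(prod(c) for c in combinations(alphas, k))


def S(alphas, k):
    """Character of the k-th symmetric power: sum of all degree-k monomials."""
    return sum(prod(c) for c in combinations_with_replacement(alphas, k))


def Pi(alphas):
    """Pi = sigma_1 sigma_2 - sigma_3."""
    return sigma(alphas, 1) * sigma(alphas, 2) - sigma(alphas, 3)


# Hypothetical eigenvalues of a 4 x 4 unitary matrix.
a = [cmath.exp(1j * t) for t in (0.2, -0.9, 1.7, 2.4)]
s1, s2, s3 = sigma(a, 1), sigma(a, 2), sigma(a, 3)

err_square = abs(s1**2 - (S(a, 2) + s2))              # sigma_1^2 = S_2 + sigma_2
err_S3 = abs(S(a, 3) - (s1**3 - 2 * s1 * s2 + s3))    # equation (19.11)
err_cube = abs(s1**3 - (S(a, 3) + s3 + 2 * Pi(a)))    # sigma_1^3 = S_3 + sigma_3 + 2 Pi

# SU(2): eigenvalues z and 1/z, so sigma_2 = 1, sigma_3 = 0 and Pi = sigma_1.
z = cmath.exp(0.9j)
su2 = [z, 1 / z]
err_su2 = abs(Pi(su2) - sigma(su2, 1))

# SU(3): the third eigenvalue is fixed by det = 1, and Pi = |sigma_1|^2 - 1.
z1, z2 = cmath.exp(0.4j), cmath.exp(-1.3j)
su3 = [z1, z2, 1 / (z1 * z2)]
err_su3 = abs(Pi(su3) - (abs(sigma(su3, 1)) ** 2 - 1))

print(err_square, err_S3, err_cube, err_su2, err_su3)
```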


Example 19.3.4. (The adjoint representation) Mesons may be con- 
sidered as bound quark—antiquark states, and we noted that these could be 
represented on the space of traceless elements of £(C*). The corresponding 
representation of U(2Q) is realized on the space of n x n matrices A whose 
trace vanishes, that is 


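$$\mathcal{L}_0 = \{A \in \mathcal{L}(\mathbb{C}^n) : \operatorname{tr}(A) = 0\}. \tag{19.12}$$

The space has the inner product

$$\langle A|B\rangle = \operatorname{tr}(A^*B), \tag{19.13}$$

and the adjoint representation on this space is defined by

$$\operatorname{ad}(U)A = UAU^{-1} = UAU^*. \tag{19.14}$$

To see that this is well defined we check that

$$\operatorname{tr}(\operatorname{ad}(U)A) = \operatorname{tr}(UAU^{-1}) = \operatorname{tr}(AU^{-1}U) = \operatorname{tr}(A) = 0. \tag{19.15}$$

Similarly we may show that $\operatorname{ad}(U)$ is unitary:

$$\begin{aligned}
\langle \operatorname{ad}(U)A|\operatorname{ad}(U)B\rangle &= \operatorname{tr}((UAU^*)^*(UBU^*)) = \operatorname{tr}(UA^*U^*UBU^*) \\
&= \operatorname{tr}(A^*BU^*U) \\
&= \operatorname{tr}(A^*B) = \langle A|B\rangle.
\end{aligned} \tag{19.16}$$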
When $j \neq k$ the elementary matrix $E_{jk}$, with a 1 in the $j$-th row and $k$-th
column and 0 elsewhere, has vanishing trace, and it is an eigenvector for the
diagonal matrix $V^{-1}UV$ with eigenvalue $\alpha_j\overline{\alpha_k}$. Together with the matrices
$E_{jj} - E_{nn}$, which also have vanishing trace and are fixed by $\operatorname{ad}(V^{-1}UV)$,
these form a basis of $\mathcal{L}_0$. The character of this representation
is therefore

$$\chi_{\mathrm{ad}}(U) = \sum_{j \neq k} \alpha_j\overline{\alpha_k} + (n - 1) = |\sigma_1|^2 - 1. \tag{19.17}$$

Proposition 16.7.1 provides a much easier way to calculate the character,
for it tells us that the conjugation action on all $n \times n$ matrices is equivalent
to the tensor product of the natural representation with its dual, and so has
character $\overline{\sigma_1}\sigma_1$. But this is also the direct sum of the adjoint representation
with the trivial representation on multiples of the identity, from which
it follows that the adjoint representation must have character $|\sigma_1|^2 - 1$.
As the character of the adjoint representation is real, it is equivalent to 
its contragredient. This means that the antiparticles of mesons are also 
mesons. 
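
The character calculation above can be illustrated by brute force. The following sketch is an assumption-laden illustration rather than part of the text: it builds one particular (hypothetical) $3 \times 3$ unitary matrix as a diagonal phase matrix times a real rotation, computes the trace of $A \mapsto UAU^*$ on all $n \times n$ matrices using the orthonormal basis $E_{jk}$ for the inner product $\operatorname{tr}(A^*B)$, and then removes the trivial summand spanned by the identity.

```python
import cmath
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def dagger(A):
    """Adjoint: complex conjugate of the transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]


def conjugation_trace(U):
    """Trace of A -> U A U* on all n x n matrices, in the basis E_jk."""
    n = len(U)
    U_star = dagger(U)
    total = 0
    for j in range(n):
        for k in range(n):
            E = [[1 if (a, b) == (j, k) else 0 for b in range(n)] for a in range(n)]
            image = matmul(matmul(U, E), U_star)
            total += image[j][k]  # <E_jk | ad(U) E_jk> = tr(E_jk* image)
    return total


# Hypothetical unitary matrix: diagonal phases times a rotation in the (1,2) plane.
theta = 0.8
c, s = math.cos(theta), math.sin(theta)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
phases = [cmath.exp(0.5j), cmath.exp(-1.2j), cmath.exp(2.0j)]
U = [[phases[i] * R[i][j] for j in range(3)] for i in range(3)]

trace_U = sum(U[j][j] for j in range(3))
full_character = conjugation_trace(U)   # expected to equal |tr U|^2
adjoint_character = full_character - 1  # remove the trivial part on multiples of 1
print(adjoint_character, abs(trace_U) ** 2 - 1)
```

Subtracting 1 uses the decomposition described in the text: the identity matrix is fixed by every $\operatorname{ad}(U)$, so it contributes the trivial character.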


Example 19.3.5. (The rotation group characters) In Corollary 9.9.3
we noted that the rotation group characters $\Delta^j$ could be lifted to the special
unitary group $SU(2)$, where they make sense for half-integral $j$ as well.
These characters are linked to those for general unitary groups by the
following relations:


Lemma 19.3.1. The following relations hold between characters of
$SU(2)$ and those of the rotation group:

$$R^*\Delta^0 = 1, \qquad R^*\Delta^{1/2} = \Pi, \qquad R^*\Delta^1 = \chi_{\mathrm{ad}}, \qquad R^*\Delta^{3/2} = S_3.$$


Proof. $D^0$ is the trivial one-dimensional representation, from which the
first statement follows. We have already noted that for $SU(2)$, $\Pi = \sigma_1$,
which has the character

$$\alpha_1 + \alpha_2 = 2\cos(\tfrac12 t) = \frac{\sin t}{\sin(\tfrac12 t)}, \tag{19.18}$$

the character of $R^*D^{1/2}$. Similarly, the character of the adjoint representation
is

$$|\sigma_1|^2 - 1 = |\alpha_1 + \alpha_2|^2 - 1 = 4\cos^2(\tfrac12 t) - 1 = 2\cos t + 1, \tag{19.19}$$

which is the character of $R^*D^1$. Finally, we have for $S_3$

$$\alpha_1^3 + \alpha_1^2\alpha_2 + \alpha_1\alpha_2^2 + \alpha_2^3 = \frac{\alpha_1^4 - \alpha_2^4}{\alpha_1 - \alpha_2} = \frac{\sin(2t)}{\sin(\tfrac12 t)}, \tag{19.20}$$

which is the character of $R^*D^{3/2}$. □
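
The three non-trivial relations of Lemma 19.3.1 can be checked numerically. This sketch assumes the eigenvalues $\alpha_1 = e^{it/2}$, $\alpha_2 = e^{-it/2}$ and the standard form $\sin((j + \tfrac12)t)/\sin(\tfrac12 t)$ for the spin-$j$ character, which agrees with (19.18) and (19.20); the function names are hypothetical.

```python
import cmath
import math


def delta(j, t):
    """Spin-j rotation character, assumed to be sin((j + 1/2) t) / sin(t / 2)."""
    return math.sin((j + 0.5) * t) / math.sin(0.5 * t)


def su2_characters(t):
    """Pi, chi_ad and S_3 for the SU(2) element with eigenvalues exp(+-it/2)."""
    a1 = cmath.exp(0.5j * t)
    a2 = a1.conjugate()
    sigma1 = a1 + a2
    return {
        "Pi": sigma1,  # for SU(2), Pi = sigma_1
        "ad": abs(sigma1) ** 2 - 1,
        "S3": a1**3 + a1**2 * a2 + a1 * a2**2 + a2**3,
    }


errors = []
for t in (0.3, 1.0, 2.5):
    chars = su2_characters(t)
    errors.append(abs(chars["Pi"] - delta(0.5, t)))
    errors.append(abs(chars["ad"] - delta(1, t)))
    errors.append(abs(chars["S3"] - delta(1.5, t)))

max_error = max(errors)
print(max_error)  # rounding-error level if the relations hold
```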


We shall henceforth find it useful to drop the R* and refer to repre- 
sentations of SU(2) as though they were representations of the rotation 
group. 


Example 19.3.6. (Power sums) Although every character is described
by a polynomial in the eigenvalues $\alpha_1, \alpha_2, \ldots$ which is invariant under
permutations, the converse is not true. The power sums,

$$s_r = \alpha_1^r + \alpha_2^r + \ldots + \alpha_n^r, \tag{19.21}$$

are clearly invariant under permutations, but apart from $s_1 = \sigma_1$, they are
not themselves characters of representations. (This can be seen by checking
that the function $s_2 = \alpha_1^2 + \alpha_2^2$ on $SU(2)$ is lifted from the rotation group
but cannot be expressed as a sum of any of the known characters given in
Section 9.8.) In fact, we can easily check that

$$s_2 = S_2 - \sigma_2, \tag{19.22}$$

and in general, the permutation-invariant polynomials are formed from
sums and differences of characters.


19.4. Subrepresentations of unitary groups 


We are now ready to undertake a more detailed examination of quark
symmetries. Following the discussion in Section 19.1, the number of quarks, $Q$,
is the product $CF$ of the number of colours, $C$, and the number of flavours,
$F$. This must be doubled to obtain the total number of quark states, since
there are two possible spin states for each. The full symmetries of the theory
are therefore given by $U(2CF)$, whilst the colour, flavour, and rotational
symmetries are described by $U(C)$, $U(F)$, and $SU(2)$, respectively.

We have already noted the distinction between the exact colour symme- 
tries and approximate or broken flavour symmetries, which arises because 
the Hamiltonian is invariant under the colour symmetries, but not under 
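the full group of flavour symmetries. This means that we need information
about what happens when the relevant representations of $U(2CF)$ are
restricted to the colour subgroup, $SU(C)$, and it is useful to consider this
problem in a broader context.

For any positive integers $m$ and $n$ there is a representation $I$ of $U(m) \times U(n)$
on $\mathbb{C}^m \otimes \mathbb{C}^n \cong \mathbb{C}^{mn}$ given by

$$I(V, W)(\phi \otimes \psi) = (V\phi) \otimes (W\psi), \tag{19.23}$$

for $V \in U(m)$, $W \in U(n)$. Since $I(V, W)$ is in $U(mn)$, any representation
$U$ of $U(mn)$ can be restricted to a representation $I^*U = U \circ I$ of
$U(m) \times U(n)$. It is natural to ask how standard representations, such as
the three-fold exterior and symmetric powers $\Lambda^3$ and $S^3$, and the adjoint
representation restrict to $U(m) \times U(n)$.

With respect to the natural bases $e_1, e_2, \ldots, e_m$ of $\mathbb{C}^m$ and $f_1, f_2, \ldots, f_n$
of $\mathbb{C}^n$ the diagonal matrices $\alpha \in \mathbb{T}^m \subset U(m)$ and $\beta \in \mathbb{T}^n \subset U(n)$ act as

$$I(\alpha, \beta)(e_j \otimes f_k) = \alpha_j e_j \otimes \beta_k f_k = \alpha_j\beta_k (e_j \otimes f_k), \tag{19.24}$$

so that $I(\alpha, \beta)$ is in the diagonal subgroup $\mathbb{T}^{mn}$ of $U(mn)$. This makes it
easy to restrict characters. For example, the natural representation gives

$$\sigma_1(I(\alpha, \beta)) = \sum_{j,k} \alpha_j\beta_k = \Big(\sum_j \alpha_j\Big)\Big(\sum_k \beta_k\Big). \tag{19.25}$$

More generally we see that

$$s_r(I(\alpha, \beta)) = \sum_{j,k} \alpha_j^r\beta_k^r = \Big(\sum_j \alpha_j^r\Big)\Big(\sum_k \beta_k^r\Big) = s_r(\alpha)s_r(\beta). \tag{19.26}$$

… possible to study the restriction simply by working with the high …


Lemma 19.4.1. Let $\sigma_3$, $S_3$, $\Pi$, and $\chi_{\mathrm{ad}}$ denote the characters of
representations of unitary groups. Then

$$\begin{aligned}
\chi_{\mathrm{ad}}(I(\alpha, \beta)) &= \chi_{\mathrm{ad}}(\alpha)\chi_{\mathrm{ad}}(\beta) + \chi_{\mathrm{ad}}(\alpha) + \chi_{\mathrm{ad}}(\beta) \\
\sigma_3(I(\alpha, \beta)) &= \sigma_3(\alpha)S_3(\beta) + \Pi(\alpha)\Pi(\beta) + S_3(\alpha)\sigma_3(\beta) \\
S_3(I(\alpha, \beta)) &= S_3(\alpha)S_3(\beta) + \Pi(\alpha)\Pi(\beta) + \sigma_3(\alpha)\sigma_3(\beta).
\end{aligned}$$


Proof. Since $\sigma_1(I(\alpha, \beta)) = \sigma_1(\alpha)\sigma_1(\beta)$, we obtain

$$\begin{aligned}
\chi_{\mathrm{ad}}(I(\alpha, \beta)) + 1 &= |\sigma_1(I(\alpha, \beta))|^2 \\
&= |\sigma_1(\alpha)|^2|\sigma_1(\beta)|^2 \\
&= [\chi_{\mathrm{ad}}(\alpha) + 1][\chi_{\mathrm{ad}}(\beta) + 1],
\end{aligned} \tag{19.27}$$

whence we obtain the formula for $\chi_{\mathrm{ad}}$.

We saw in Example 19.3.3 that

$$s_1(\alpha)^3 = S_3(\alpha) + \sigma_3(\alpha) + 2\Pi(\alpha),$$

and it is easy to check the following identities in $\alpha$:

$$\begin{aligned}
s_3(\alpha) &= \sum_j \alpha_j^3 = S_3(\alpha) + \sigma_3(\alpha) - \Pi(\alpha) \\
s_1(\alpha)s_2(\alpha) &= \Big(\sum_j \alpha_j\Big)\Big(\sum_k \alpha_k^2\Big) = S_3(\alpha) - \sigma_3(\alpha).
\end{aligned}$$

Using these and the similar identities for $s_r(\beta)$ and $s_r(I(\alpha, \beta))$ together
with the preceding observation that $s_r(I(\alpha, \beta)) = s_r(\alpha)s_r(\beta)$, we obtain

$$\begin{aligned}
S_3(I(\alpha, \beta)) + \sigma_3(I(\alpha, \beta)) + 2\Pi(I(\alpha, \beta)) &= [S_3(\alpha) + \sigma_3(\alpha) + 2\Pi(\alpha)][S_3(\beta) + \sigma_3(\beta) + 2\Pi(\beta)] \\
S_3(I(\alpha, \beta)) + \sigma_3(I(\alpha, \beta)) - \Pi(I(\alpha, \beta)) &= [S_3(\alpha) + \sigma_3(\alpha) - \Pi(\alpha)][S_3(\beta) + \sigma_3(\beta) - \Pi(\beta)] \\
S_3(I(\alpha, \beta)) - \sigma_3(I(\alpha, \beta)) &= [S_3(\alpha) - \sigma_3(\alpha)][S_3(\beta) - \sigma_3(\beta)].
\end{aligned} \tag{19.28}$$

Taking the first equation of our triad plus twice the second gives

$$S_3(I(\alpha, \beta)) + \sigma_3(I(\alpha, \beta)) = [S_3(\alpha) + \sigma_3(\alpha)][S_3(\beta) + \sigma_3(\beta)] + 2\Pi(\alpha)\Pi(\beta), \tag{19.29}$$

and combining this with the third gives

$$\begin{aligned}
\sigma_3(I(\alpha, \beta)) &= \sigma_3(\alpha)S_3(\beta) + \Pi(\alpha)\Pi(\beta) + S_3(\alpha)\sigma_3(\beta) \\
S_3(I(\alpha, \beta)) &= S_3(\alpha)S_3(\beta) + \Pi(\alpha)\Pi(\beta) + \sigma_3(\alpha)\sigma_3(\beta).
\end{aligned}$$

□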
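
Lemma 19.4.1 lends itself to a direct numerical check, since the eigenvalues of $I(\alpha, \beta)$ are just the products $\alpha_j\beta_k$. The following is a minimal sketch with hypothetical sample eigenvalues ($m = 2$, $n = 3$) and illustrative helper names; it is a sanity check, not a proof.

```python
import cmath
from itertools import combinations, combinations_with_replacement
from math import prod


def sigma3(e):
    return sum(prod(c) for c in combinations(e, 3))


def S3(e):
    return sum(prod(c) for c in combinations_with_replacement(e, 3))


def Pi(e):
    sigma1 = sum(e)
    sigma2 = sum(prod(c) for c in combinations(e, 2))
    return sigma1 * sigma2 - sigma3(e)


def chi_ad(e):
    return abs(sum(e)) ** 2 - 1


def restricted(alphas, betas):
    """Eigenvalues of I(alpha, beta) on the tensor product: all products alpha_j beta_k."""
    return [a * b for a in alphas for b in betas]


# Hypothetical diagonal elements of U(2) and U(3).
alphas = [cmath.exp(1j * t) for t in (0.4, -1.1)]
betas = [cmath.exp(1j * t) for t in (0.9, 2.2, -0.3)]
ab = restricted(alphas, betas)

err_ad = abs(chi_ad(ab) - (chi_ad(alphas) * chi_ad(betas) + chi_ad(alphas) + chi_ad(betas)))
err_sigma3 = abs(
    sigma3(ab)
    - (sigma3(alphas) * S3(betas) + Pi(alphas) * Pi(betas) + S3(alphas) * sigma3(betas))
)
err_S3 = abs(
    S3(ab)
    - (S3(alphas) * S3(betas) + Pi(alphas) * Pi(betas) + sigma3(alphas) * sigma3(betas))
)
print(err_ad, err_sigma3, err_S3)
```

With $m = 2$ the term $\sigma_3(\alpha)$ vanishes, which is exactly the simplification used later for the rotational $SU(2)$.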


19.5. Particle multiplets 


The group $U(2Q)$ contains the subgroups $SU(2)$ describing rotations of
the spin $\tfrac12$ particles, the subgroup $U(C)$ of colour symmetries, and the
subgroup $U(F)$ of flavour symmetries. Since we do not need to allow for
multiples of the identity more than once, we shall now investigate how
the representations of $U(2FC)$ restrict to the product subgroup $SU(C) \times
SU(2) \times U(F)$.

Before doing this we need to incorporate two further pieces of physical
information. The first is that experiments indicate that $C = 3$. The
second is that the colour symmetry seems to be so precise that only parti- 
cles transforming under the trivial representation of SU(C) = SU(3) seem 
to be observed in isolation. (At least if other representations do occur it 
is only at energies beyond the range of present-day experiments.) This 
is often expressed by saying that only colourless combinations of quarks 
are observable in isolation. (This means, in particular, that quarks them- 
selves, which transform under the natural representation of SU(C), are not 
observable in isolation.) 


Assumption 19.5.1. The number of colours, $C$, is three, and only
the part, $[U]_0$, of a representation, $U$, that transforms trivially under
$SU(C)$, is relevant for baryons and mesons.


In the following discussion the superscripts $R$, $C$ and $F$ refer to characters
of $SU(2)$, $SU(C)$ and $U(F)$.


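Theorem 19.5.1. The parts of the adjoint and exterior cube representations
that transform trivially under the colour subgroup $SU(C)$
decompose under rotations and flavour transformations in the following
way:

$$\begin{aligned}
[\sigma_3]_0 &= \Delta^{3/2}S_3^F + \Delta^{1/2}\Pi^F; \\
[\chi_{\mathrm{ad}}]_0 &= \Delta^1(\chi_{\mathrm{ad}}^F + 1^F) + \Delta^0\chi_{\mathrm{ad}}^F,
\end{aligned}$$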


where the rotational representations are given by Lemma 19.3.1. 


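Proof. Applying Lemma 19.4.1 to $SU(C) \times U(2F)$ we obtain

$$\begin{aligned}
\sigma_3 &= \sigma_3^C S_3^{2F} + \Pi^C\Pi^{2F} + S_3^C\sigma_3^{2F} \\
\chi_{\mathrm{ad}} &= \chi_{\mathrm{ad}}^C\chi_{\mathrm{ad}}^{2F} + \chi_{\mathrm{ad}}^C + \chi_{\mathrm{ad}}^{2F}.
\end{aligned} \tag{19.30}$$

Since $C = 3$ we have $\sigma_3^C = 1$, and so the parts that transform trivially
under $SU(C)$ are

$$\begin{aligned}
[\sigma_3]_0 &= S_3^{2F} \\
[\chi_{\mathrm{ad}}]_0 &= \chi_{\mathrm{ad}}^{2F}.
\end{aligned} \tag{19.31}$$

Now we can apply Lemma 19.4.1 again to $SU(2) \times U(F)$ to obtain

$$\begin{aligned}
S_3^{2F} &= S_3^R S_3^F + \Pi^R\Pi^F + \sigma_3^R\sigma_3^F; \\
\chi_{\mathrm{ad}}^{2F} &= \chi_{\mathrm{ad}}^R\chi_{\mathrm{ad}}^F + \chi_{\mathrm{ad}}^R + \chi_{\mathrm{ad}}^F.
\end{aligned} \tag{19.32}$$

Using Lemma 19.3.1 to identify the characters of the rotational $SU(2)$, for
which $\sigma_3^R = 0$, we obtain the results given. □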


Corollary 19.5.2. The mesons and baryons fit into the following 
families: 

(i) $F(F + 1)(F + 2)/6$ spin $\tfrac32$ baryons transforming under the
symmetric cube representation of $U(F)$ on $S^3\mathbb{C}^F$;

(ii) $F(F^2 - 1)/3$ spin $\tfrac12$ baryons transforming under the $\Pi$
representation of $U(F)$;

(iii) $F^2 - 1$ spin 1 mesons transforming under the adjoint representation
of $U(F)$ and a single spin 1 meson transforming trivially under
$U(F)$;

(iv) $F^2 - 1$ spin 0 mesons transforming under the adjoint representation
of $U(F)$.


Proof. The dimensions of $S_3^F$, $\Pi$, and the adjoint representations are
$F(F + 1)(F + 2)/6$, $F(F^2 - 1)/3$, and $F^2 - 1$, respectively, so that this
result is just giving a physical interpretation of the previous theorem. □
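
The multiplet sizes can be tabulated and cross-checked against Theorem 19.5.1 by evaluating characters at the identity, where each character reduces to a dimension. In the sketch below the function names are hypothetical; the spin multiplicities 4, 2, 3, and 1 are the dimensions $2j + 1$ for $j = \tfrac32, \tfrac12, 1, 0$.

```python
from math import comb


def dim_S3(N):
    """Dimension of the symmetric cube of C^N: N(N+1)(N+2)/6."""
    return comb(N + 2, 3)


def dim_Pi(N):
    """Dimension of the Pi representation: N(N^2 - 1)/3."""
    return N * (N * N - 1) // 3


def dim_ad(N):
    """Dimension of the adjoint representation: N^2 - 1."""
    return N * N - 1


def families(F):
    """Multiplet sizes of Corollary 19.5.2 for F flavours."""
    return {
        "spin 3/2 baryons": dim_S3(F),
        "spin 1/2 baryons": dim_Pi(F),
        "spin 1 mesons": dim_ad(F) + 1,
        "spin 0 mesons": dim_ad(F),
    }


for F in range(1, 7):
    # [sigma_3]_0 = S_3^{2F}: four spin states per S_3^F state, two per Pi^F state.
    assert dim_S3(2 * F) == 4 * dim_S3(F) + 2 * dim_Pi(F)
    # [chi_ad]_0 = chi_ad^{2F}: three spin states for spin 1, one for spin 0.
    assert dim_ad(2 * F) == 3 * (dim_ad(F) + 1) + 1 * dim_ad(F)

print(families(3))  # {'spin 3/2 baryons': 10, 'spin 1/2 baryons': 8, 'spin 1 mesons': 9, 'spin 0 mesons': 8}
print(families(6))  # {'spin 3/2 baryons': 56, 'spin 1/2 baryons': 70, 'spin 1 mesons': 36, 'spin 0 mesons': 35}
```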


Remark 19.5.1. When this theory was first propounded it was believed
that there were just three flavours, called up, down, and strangeness, giving
$3 \cdot 4 \cdot 5/6 = 10$ spin $\tfrac32$ baryons and $3(3^2 - 1)/3 = 8$ spin $\tfrac12$ baryons as well
as $3^2 - 1 = 8$ mesons of spin 0 and $3^2 = 9$ mesons of spin 1. In 1974 a
fourth flavour, called charm, was discovered, and then a fifth, called beauty
(or bottom). It has been generally believed that flavours come in pairs, and
in 1995 the discovery of quarks of the sixth flavour, called truth (or top),
was announced and confirmed. With a mass around 188 times that of a
proton, the top quark is some forty times more massive than its partner,
the bottom quark, giving a clear indication of the extent to which flavour
symmetry is broken. One now expects that $F = 6$, giving $6 \cdot 7 \cdot 8/6 = 56$
spin $\tfrac32$ baryons and $6(6^2 - 1)/3 = 70$ spin $\tfrac12$ baryons as well as $6^2 - 1 = 35$
mesons of spin 0 and $6^2 = 36$ mesons of spin 1.


19.6. Conserved quantities 


The evidence suggests that, although the Hamiltonian does not commute
with the whole representation of $U(F)$, it does commute with its diagonal
subgroup. Since this subgroup is abelian and isomorphic to $\mathbb{T}^F$,
Example 9.4.4 tells us that its irreducible representations are of the form
$\alpha_1^{k_1}\alpha_2^{k_2}\cdots\alpha_F^{k_F}$. Their action commutes with the Hamiltonian, so the
numbers $k_1, k_2, \ldots, k_F$ must be conserved quantities. These can be identified
with the physical conserved quantities such as charge and baryon number
as follows:


Definition 19.6.1. An eigenvector for the diagonal subgroup $\mathbb{T}^F \subset
U(F)$ that corresponds to the representation $\alpha_1^{k_1}\alpha_2^{k_2}\cdots\alpha_F^{k_F}$
corresponds to a state having baryon number $B = \tfrac13\sum_j k_j$, charge $Z =
\sum_j k_{2j} - B$, strangeness $S = -k_3$, charm $C = k_4$, beauty $-k_5$, and
truth $k_6$.


Remark 19.6.1. By definition $\tfrac13\sum_j k_j$ takes the value 1 for $S_3$ and $\Pi$, and
vanishes for the adjoint and trivial representations. Using the definition of
$B$ the charge can also be expressed as

$$Z = \tfrac23\sum_j k_{2j} - \tfrac13\sum_j k_{2j-1}. \tag{19.33}$$

The non-trivial characters of the flavour group, $U(F)$, that appear are

$$\begin{aligned}
S_3 &= \sum_j \alpha_j^3 + \sum_{j \neq k} \alpha_j^2\alpha_k + \sum_{j<k<l} \alpha_j\alpha_k\alpha_l \\
\Pi &= \sum_{j \neq k} \alpha_j^2\alpha_k + 2\sum_{j<k<l} \alpha_j\alpha_k\alpha_l \\
\chi_{\mathrm{ad}} &= \sigma_1\overline{\sigma_1} - 1 = \sum_{j,k} \alpha_j\alpha_k^{-1} - 1.
\end{aligned} \tag{19.34}$$

The proton and neutron are spin $\tfrac12$ baryons and, according to Corollary
19.5.2, are therefore described by the representation $\Pi$. With our
conventions their states correspond to the eigenvectors with eigenvalues $\alpha_1\alpha_2^2$
and $\alpha_1^2\alpha_2$ respectively. Their charges are therefore $(2 \times 2 - 1)/3 = 1$ and
$(2 \times 1 - 2)/3 = 0$, in agreement with the experimental evidence, and their
other quantum numbers for strangeness, charm, truth, and beauty, all vanish.
Next consider baryons and mesons whose eigenvalues involve only $\alpha_1$,
$\alpha_2$, and $\alpha_3$. Since $k_1 + k_2 + k_3 = 3B$ is known, only the charge $Z$ and
strangeness $S$ are needed to specify the state. The possible values can be
plotted on a diagram in the $(Z, S)$-plane (Figures 19.1 and 19.2).

FIGURE 19.1. The spin $\tfrac12$ baryons corresponding to the representation $\Pi$, whose
eigenvalues are shown on the right. The strangeness $S$ is constant on the rows and
takes the values 0, −1, −2 from top to bottom. The charge $Z$ is constant on each
column and takes the values −1, 0, +1, from left to right of each pattern.
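
The bookkeeping of Definition 19.6.1 can be sketched in a few lines. The code below rests on an assumption that should be checked against the printed text: the exponents $k_1, \ldots, k_6$ are taken to refer to the flavours (down, up, strange, charm, bottom, top) in that order, an ordering inferred here from the proton and neutron charges quoted above, so that the charge is $Z = k_2 + k_4 + k_6 - B$. The function and variable names are hypothetical.

```python
from fractions import Fraction

# Assumed flavour ordering of the exponents k_1..k_6 (inferred, not quoted from the text).
FLAVOURS = ("down", "up", "strange", "charm", "bottom", "top")


def quantum_numbers(k):
    """Quantum numbers of the state with eigenvalue alpha_1^k1 ... alpha_6^k6.

    Negative exponents stand for antiquarks.
    """
    k = list(k) + [0] * (len(FLAVOURS) - len(k))
    B = Fraction(sum(k), 3)          # baryon number
    Z = sum(k[1::2]) - B             # charge: k_2 + k_4 + k_6 - B
    return {"B": B, "Z": Z, "S": -k[2], "C": k[3], "beauty": -k[4], "truth": k[5]}


proton = quantum_numbers((1, 2))          # alpha_1 alpha_2^2
neutron = quantum_numbers((2, 1))         # alpha_1^2 alpha_2
omega_minus = quantum_numbers((0, 0, 3))  # alpha_3^3
pion_plus = quantum_numbers((-1, 1))      # alpha_2 alpha_1^{-1}

print(proton["Z"], neutron["Z"], omega_minus["Z"], omega_minus["S"], pion_plus["Z"])
```

Using `Fraction` keeps the baryon number exact, so mesons (with $B = 0$) and baryons (with $B = 1$) are handled by the same formula.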

When this theory was first introduced all but the $\Omega^-$ particle were
known. It was the discovery of this baryon early in 1964 that convinced
most physicists of the plausibility of the theory.

Just as the adjoint representation of U(F) describes the mesons (Figure 
19.3) that are associated with the strong force, so the adjoint representation 
of SU(C), that is of SU(3), is believed to describe particles called gluons, 
which mediate the forces holding quarks together. There is therefore a 
diagram similar to Figure 19.3 showing the eight possible gluon states. 
However, since gluons have non-trivial colour transformations one would 
not expect to see them in isolation. 


19.7. Historical note 


As mentioned, Heisenberg’s isospin theory treated the proton and neutron
as two states of a single particle called the nucleon, which amounted to
looking at a theory with $F = 2$. Since $\Pi = \sigma_1$ for $SU(2)$, the nucleons
themselves played the role of quarks. In the early 1960s Murray Gell-Mann and
Yuval Ne’eman independently proposed that a theory with $F = 3$ would
be more appropriate, and the discovery of the $\Omega^-$ particle soon provided
strong evidence for this. The characters of $SU(3)$ all lie in $\mathbb{C}[\sigma_1, \overline{\sigma_1}]$, so that
all the representations can be derived from the natural representation and
its contragredient. That suggested that all particles might be composed of
quarks transforming with the natural representation and antiquarks
transforming with its contragredient. (The name quark was taken by Gell-Mann
from a passage mentioning ‘three quarks’ in James Joyce’s book Finnegans
Wake.) As already observed the discovery of charm in 1974 raised the value
of $F$ to 4, and subsequent developments have suggested a figure of $F = 6$.
For these larger groups some characters, such as $\sigma_2$, cannot be given by
functions of $\sigma_1$ and $\overline{\sigma_1}$ alone, but it is nonetheless true that all the particles
so far discovered can be regarded as combinations of quarks and antiquarks.

FIGURE 19.2. The spin $\tfrac32$ baryons corresponding to the representation $S_3$, together
with the corresponding eigenvalues on the right. The strangeness $S$ is constant
along the rows, taking the values 0, −1, −2, and −3 from top to bottom. The
charge $Z$ is constant on the columns, taking the values −1, 0, +1, and +2 from
left to right.

FIGURE 19.3. The spin 1 mesons corresponding to the adjoint representation of
the flavour group, together with the corresponding eigenvalues on the right. The
strangeness $S$ is constant on the rows and takes the values +1, 0, and −1 from top
to bottom. The charge $Z$ is constant on each column and takes the values −1, 0,
+1, from left to right.

Early attempts to link spin and flavour led to the realization that all 
the baryons fitted together in the symmetric cube $S^3$ of $U(2F)$. However,
since the spin $\tfrac12$ quarks ought to be fermions it was necessary to change
this to $\Lambda^3$, and colour was introduced as a device for doing this, and so
rescuing Pauli’s exclusion principle. However, the assumption that only 
colourless combinations are stable is much stronger than the Pauli principle 
itself. Indeed one can use the assumption to explain why quarks are only 
seen in threes or in combination with antiquarks, since these are the basic 
colourless combinations. 

This chapter has focused on the baryons, but there is strong evidence 
that these are closely linked to the leptons too. There are three known 
massive leptons, the electron, the muon, and the tau particle, each associ- 
ated with a very light or massless neutrino, and to each of these particles 
corresponds an antiparticle. (It has been amply confirmed that, although 
apparently massless, chargeless, and lacking in distinctive qualities, the $e$,
$\mu$, and $\tau$ neutrinos are really different from each other, and from their
antiparticles.) There are also three generations of quarks, each consisting of a
pair of quarks (up-down, strange-charmed, bottom-top) one of which has
charge $\tfrac23$ and the other charge $-\tfrac13$, and for each quark there is an antiquark.
The three pairs of quarks seem to correspond in some, as yet imprecisely
understood, way to the three pairs consisting of a massive lepton and its
neutrino. (For example, the difference in charge between two quarks in a
pair is $\tfrac23 + \tfrac13 = 1$, and so is that between a neutrino and the corresponding
massive lepton.) 

There have been various attempts to explain this, including some that 
interpreted both leptons and baryons as bound states of still more fun- 
damental, but as yet undiscovered, particles. But all have disadvantages 
and none has yet found widespread support. Whilst this chapter was be- 
ing written, some new experimental evidence emerged which suggests that 
quarks may themselves have internal structure. Whether that particular 
result is confirmed or not it is clear that our understanding of the structure 
of matter is still far from complete. 


Exercises 


19.1 Prove that the power sum $s_2$ cannot be expressed as a sum of
characters lifted from the rotation group.


19.2 Show that for $U$ in the unitary group

$$\det(1 - tU)^{-1} = \sum_{r=0}^{\infty} t^r S_r(U).$$

Hence or otherwise, show that

$$\sum_r (-t)^r\sigma_r \sum_s t^s S_s = 1.$$

Deduce that

$$\sum_r (-1)^r\sigma_r S_{n-r} = 0,$$

for $n > 0$.


19.3 By writing

$$\prod_j (1 + t\alpha_j) = \exp\Big(\sum_j \ln(1 + t\alpha_j)\Big)$$

and expanding the logarithm, or otherwise, show that, for any unitary
matrix $U$,

$$\sum_r t^r\sigma_r(U) = \exp\Big(-\sum_r (-t)^r s_r(U)/r\Big).$$
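
Before attempting a proof of the identity in Exercise 19.2, it can be reassuring to test it numerically. The sketch below does only that, for a hypothetical choice of three eigenvalues; it uses the convention that $\sigma_r = 0$ when $r$ exceeds the size of the matrix, and the helper names are illustrative.

```python
import cmath
from itertools import combinations, combinations_with_replacement
from math import prod


def sigma(alphas, r):
    """Elementary symmetric polynomial; an empty sum gives 0 when r > len(alphas)."""
    return sum(prod(c) for c in combinations(alphas, r))


def S(alphas, r):
    """Sum of all monomials of degree r (S_0 = 1)."""
    return sum(prod(c) for c in combinations_with_replacement(alphas, r))


# Hypothetical eigenvalues of a 3 x 3 unitary matrix.
alphas = [cmath.exp(1j * t) for t in (0.5, -1.4, 2.3)]

residuals = []
for n in range(1, 7):
    total = sum((-1) ** r * sigma(alphas, r) * S(alphas, n - r) for r in range(n + 1))
    residuals.append(abs(total))

print(max(residuals))  # rounding-error level if the identity holds
```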
Appendix A1 A review of linear algebra and groups


It will interest mathematical circles that the mathematical instruments
created by the higher algebra play an essential part in the rational formulation
of the new quantum mechanics.


NIELS BOHR, lecture on Atomic theory and mechanics, 30 August 1925 


A1.1. Inner product spaces 


The basic algebraic ideas and results which we have used can be found 
in any textbooks on linear algebra and groups, but, for convenience, this 
appendix contains some of those most relevant to quantum mechanics. Al- 
though vector spaces can be defined over other fields we shall use only the 
complex numbers. 


Definition A1.1.1. A (complex) vector space $V$ consists of a set of
elements called vectors which forms an abelian group with respect to
an addition operation written $(u, v) \mapsto u + v$, and which has a scalar
multiplication $\mathbb{C} \times V \to V$, denoted by $(\lambda, v) \mapsto \lambda v$, satisfying

$$\begin{aligned}
(\lambda + \mu)v &= \lambda v + \mu v \\
(\lambda\mu)v &= \lambda(\mu v) \\
\lambda(u + v) &= \lambda u + \lambda v \\
1v &= v.
\end{aligned}$$


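Definition A1.1.2. An inner product space, $\mathcal{H}$, is a vector space
which has a map $\mathcal{H} \times \mathcal{H} \to \mathbb{C}$, written as $(u, v) \mapsto \langle u|v\rangle$, such that

(i) $\langle u|v\rangle = \overline{\langle v|u\rangle}$ for all $u, v \in \mathcal{H}$;

(ii) $\langle u|\alpha v + \beta w\rangle = \alpha\langle u|v\rangle + \beta\langle u|w\rangle$ for all $u, v, w \in \mathcal{H}$, and for all
$\alpha, \beta \in \mathbb{C}$;

(iii) $\langle u|u\rangle > 0$ for all non-zero vectors $u$ in $\mathcal{H}$.

(It is a consequence of (ii) that $\langle u|0\rangle = 0$ for any vector $u \in \mathcal{H}$ and so
in particular $\langle 0|0\rangle = 0$.)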
As emphasized in Section 6.1, quantum mechanics uses a different
convention from that in most algebra texts in that the inner products are linear
in the second variable and conjugate linear in the first.


Definition A1.1.3. The norm $\|v\|$ of a vector $v$ is defined by

$$\|v\|^2 = \langle v|v\rangle.$$

If $\|v\| = 1$ then $v$ is said to be normalized. Two vectors $u$ and $v$ in $\mathcal{H}$
are said to be orthogonal if $\langle u|v\rangle = 0$.


If there are no non-zero vectors orthogonal to all elements of a set $S$
then $S$ is said to span the space. In finite dimensions this is equivalent to
the usual definition that every vector is a linear combination of elements
of $S$, and in infinite dimensions that every vector is a limit of such linear
combinations.


Definition A1.1.4. A collection of vectors $\{v_j\}$ in $\mathcal{H}$ which satisfies
$\langle v_j|v_k\rangle = \delta_{jk}$ for all $j$ and $k$ is said to be an orthonormal set. An
orthonormal set of vectors is necessarily linearly independent, and if
it also spans $\mathcal{H}$ then it is called an orthonormal basis.


Every finite-dimensional inner product space has an orthonormal basis.
Suppose, now, that one wishes to expand a vector $\psi$ in terms of an infinite
orthonormal spanning set of vectors $\{\psi_k\}$. Setting $\phi_N = \psi - \sum_{k=1}^N \langle\psi_k|\psi\rangle\psi_k$,
we easily calculate that $\|\phi_N\|^2 = \|\psi\|^2 - \sum_{k=1}^N |\langle\psi_k|\psi\rangle|^2$, from which the
Bessel inequality follows:

$$\|\psi\|^2 \geq \sum_{k=1}^N |\langle\psi_k|\psi\rangle|^2.$$

The sequence of partial sums on the right is monotonic increasing and
bounded above, and so converges. This means that

$$\|\phi_N - \phi_M\|^2 = \sum_{k=M+1}^N |\langle\psi_k|\psi\rangle|^2 \tag{A1.1}$$

can be made arbitrarily small for $M$ and $N$ large enough. In infinite
dimensions one makes a completeness assumption (see Section 6.2) which forces
the sequence $\phi_N$ to converge to a vector $\phi$. By construction, $\phi_N$ is
orthogonal to $\psi_k$ for $k \leq N$, and, going to the limit, $\phi$ must be orthogonal to all
members of the sequence, and so zero. Since $\psi = \phi_N + \sum_{k=1}^N \langle\psi_k|\psi\rangle\psi_k$, we
may take the limit to obtain $\psi = \sum_{k=1}^{\infty} \langle\psi_k|\psi\rangle\psi_k$.
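
A finite-dimensional analogue of this argument can be run directly. The sketch below is illustrative only: it orthonormalizes three hypothetical vectors in $\mathbb{C}^3$ by the Gram-Schmidt process (a technique not described in the text), then tracks the partial sums in the Bessel inequality and the remainder $\phi_N$. The inner product follows the convention above, conjugate linear in the first argument.

```python
def inner(u, v):
    """<u|v>: conjugate linear in the first argument, linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Orthonormalize a linearly independent list of vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(e, w)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = abs(inner(w, w)) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis


# Hypothetical data: three independent vectors in C^3 and a vector psi to expand.
psis = gram_schmidt([[1, 1j, 0], [0, 1, 1], [1, 0, 2 - 1j]])
psi = [2, -1j, 0.5 + 1j]
norm_sq = abs(inner(psi, psi))

partial_sums = []
remainders = []
phi = list(psi)
running = 0.0
for e in psis:
    c = inner(e, psi)
    running += abs(c) ** 2
    phi = [p - c * ei for p, ei in zip(phi, e)]  # phi_N = psi - sum_k <psi_k|psi> psi_k
    partial_sums.append(running)
    remainders.append(abs(inner(phi, phi)))      # ||phi_N||^2

print(norm_sq, partial_sums, remainders)
```

In this finite-dimensional setting the remainder vanishes once all three basis vectors have been used, which mirrors the limiting statement $\phi = 0$.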


The Cauchy–Schwarz–Bunyakowski inequality A1.1.1. For
any $x$ and $y$ in $\mathcal{H}$

$$|\langle x|y\rangle|^2 \leq \|x\|^2\|y\|^2$$

with equality if and only if $x$ and $y$ are linearly dependent.


A1.2. Linear transformations 


Definition A1.2.1. Let $V$ and $W$ be vector spaces. A map $T$
from $V$ to $W$ which preserves linear combinations, that is such that
$T(\alpha u + \beta v) = \alpha Tu + \beta Tv$, for all $\alpha, \beta \in \mathbb{C}$ and all $u, v \in V$, is called
a linear transformation.


The set of all linear transformations will be denoted by $\mathcal{L}(V, W)$. When
$V = W$ we shall abbreviate $\mathcal{L}(V, V)$ to $\mathcal{L}(V)$. When $W$ is the
one-dimensional space $\mathbb{C}$ we write $V^*$ for $\mathcal{L}(V, \mathbb{C})$, and refer to it as the dual of
$V$. The elements of $V^*$ are called linear functionals. A linear transformation
is determined by its action on a basis.


Definition A1.2.2. The matrix $T_{aj}$ of a linear transformation $T \in
\mathcal{L}(V, W)$ with respect to bases $\{v_j\}$ for $V$ and $\{w_a\}$ for $W$ is defined
by the expansion

$$Tv_j = \sum_a T_{aj}w_a.$$

When $V = W$ the trace, $\operatorname{tr} T$, is defined by $\operatorname{tr} T = \sum_j T_{jj}$, and may be
shown to be independent of the basis used.
Definition A1.2.3. The sum of two linear transformations $S$ and $T$
from $V$ to $W$ is the map defined by $(S + T)v = Sv + Tv$. Similarly,
the product with a complex number $\lambda$ can be defined by $(\lambda T)v =
\lambda(Tv)$. Both $S + T$ and $\lambda T$ can themselves be shown to be linear
transformations, so that these operations give $\mathcal{L}(V, W)$ the structure
of a vector space. Using matrices it can be shown that $\dim \mathcal{L}(V, W) =
\dim V \dim W$, so that in particular $\dim V^* = \dim V$.
Definition A1.2.4. The image or range, $\operatorname{im}(T)$, of $T$ is the set of
vectors in $W$ of the form $Tv$ for some $v$ in $V$. The image is a subspace
of $W$ and its dimension is called the rank of $T$.

The kernel of $T$ is defined by

$$\ker T = \{v \in V : Tv = 0\}.$$

The kernel is a subspace of $V$ and its dimension is called the nullity
of $T$.


Proposition A1.2.1. The linear transformation $T \in \mathcal{L}(V, W)$ is
one-one if and only if $\ker T = \{0\}$, and is onto if and only if $\operatorname{im} T = W$.


Definition A1.2.5. Suppose that $T \in \mathcal{L}(V)$ and $v \in V$ is a non-zero
vector for which there exists some complex number $\lambda$ such that

$$Tv = \lambda v;$$

then $v$ is said to be an eigenvector of $T$ with eigenvalue $\lambda$.


Definition A1.2.6. The characteristic polynomial, $\chi_T$, is defined by

$$\chi_T(t) = \det(t1 - T),$$

where the determinant can be calculated by choosing any basis and
expressing $T$ in matrix form.


If $V$ is finite dimensional then a necessary and sufficient condition for $\lambda$
to be an eigenvalue of $T$ is that it be a root of the characteristic polynomial,
that is $\chi_T(\lambda) = 0$.


Definition A1.2.7. If $S \in \mathcal{L}(U, V)$ and $T \in \mathcal{L}(V, W)$ then the
composition or product, $TS$, is a linear transformation in $\mathcal{L}(U, W)$. If
$S$ and $T$ are both in $\mathcal{L}(V)$ then so is $TS$. In particular, powers of
$T \in \mathcal{L}(V)$ may be defined inductively by

$$T^1 = T, \qquad T^n = T(T^{n-1}).$$

By convention $T^0 = 1$, the identity operator.


Given any polynomial

$$p(t) = \sum_{k=0}^N a_k t^k,$$

we may form the corresponding operator

$$p(T) = \sum_{k=0}^N a_k T^k \in \mathcal{L}(V).$$


The Cayley–Hamilton theorem A1.2.2. If $\dim(V)$ is finite then

$$\chi_T(T) = 0.$$
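
For a $2 \times 2$ matrix the characteristic polynomial is $\chi_T(t) = t^2 - (\operatorname{tr} T)t + \det T$, so the Cayley–Hamilton theorem can be verified by hand or by a few lines of code. The sketch below uses a hypothetical matrix with small integer and imaginary entries, chosen so that the arithmetic is exact.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


# Hypothetical 2 x 2 complex matrix.
T = [[2, 1j], [3, -1]]

trace = T[0][0] + T[1][1]
det = T[0][0] * T[1][1] - T[0][1] * T[1][0]

# chi_T(T) = T^2 - (tr T) T + (det T) 1
T_squared = matmul(T, T)
residual = [
    [T_squared[i][j] - trace * T[i][j] + det * (1 if i == j else 0) for j in range(2)]
    for i in range(2)
]
print(residual)  # every entry should vanish
```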
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A1.3. The spectral theorem 


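Definition A1.3.1. Let $\mathcal{H}$ and $\mathcal{K}$ be inner product spaces, and
$T \in \mathcal{L}(\mathcal{H}, \mathcal{K})$ a linear transformation. The adjoint, $T^* \in \mathcal{L}(\mathcal{K}, \mathcal{H})$, is
defined by the identity

$$\langle T^*u|v\rangle = \langle u|Tv\rangle.$$

In finite dimensions this identity defines $T^*$ uniquely. In infinite dimensions
this and the next definition need some modification. (See Section 6.2.)


Definition A1.3.2. If $\mathcal{H} = \mathcal{K}$, then $T$ and $T^*$ are both in $\mathcal{L}(\mathcal{H})$. We
define $T$ to be self-adjoint if $T = T^*$, that is if

$$\langle Tu|v\rangle = \langle u|Tv\rangle$$

for all $u$ and $v$ in $\mathcal{H}$.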


The spectral theorem A1.3.1. If $\mathcal{H}$ is a finite-dimensional space
and $T \in \mathcal{L}(\mathcal{H})$ is self-adjoint then there exists an orthonormal basis
$\{v_j\}$ of eigenvectors, that is vectors satisfying

$$Tv_j = \lambda_j v_j \quad\text{and}\quad \langle v_j|v_k\rangle = \delta_{jk}.$$


Proof. We work by induction on $n = \dim \mathcal{H}$. First note that over $\mathbb{C}$
the characteristic polynomial has a root, $\lambda$, and so there exists a non-zero
vector $v$ such that $Tv = \lambda v$. We normalize this and set $v_n = v/\|v\|$.
If $n = 1$ then $v_n$ already provides an orthonormal basis of eigenvectors.
Otherwise, define $W$ to be the subspace of vectors orthogonal to $v$. For
$w \in W$ consider

$$\langle Tw|v\rangle = \langle w|Tv\rangle = \langle w|\lambda v\rangle = \lambda\langle w|v\rangle = 0.$$

This shows that $Tw \in W$, and so $T(W) \subseteq W$. Let us write $T_W$ for the
restriction of $T$ to $W$. The self-adjointness condition continues to hold when
the vectors are restricted to lie in $W$, and that subspace has dimension less
than that of $\mathcal{H}$, so applying the inductive hypothesis we may assume that
$W$ has an orthonormal basis $v_1, v_2, \ldots, v_{n-1}$ of eigenvectors for $T_W$, that is
$Tv_j = T_W v_j = \lambda_j v_j$ for $j = 1, 2, \ldots, n - 1$. Adjoining $v_n$ to this set provides
a suitable orthonormal basis of eigenvectors for $\mathcal{H}$. □


Definition A1.3.3. If $T \in \mathcal{L}(\mathcal{H})$ is self-adjoint then $\langle Tu|u\rangle$ is real for
all $u \in \mathcal{H}$. If, further, $\langle Tu|u\rangle \geq 0$ for all $u \in \mathcal{H}$ then $T$ is said to be
positive.


Definition A1.3.4. A self-adjoint linear transformation $P \in \mathcal{L}(\mathcal{H})$
which satisfies

$$P^2 = P = P^*$$

is called a projection.


Definition A1.3.5. A linear transformation $U \in \mathcal{L}(\mathcal{H})$ which satisfies
$U^*U = 1 = UU^*$ is said to be unitary.


This definition means that $\langle Uu|Uv\rangle = \langle u|v\rangle$, for all vectors $u$ and $v$, so
unitary transformations preserve the inner product.


The spectral theorem for unitary transformations A1.3.2.
If $H$ is a finite-dimensional space and $U \in \mathcal{L}(H)$ is unitary then
there exists an orthonormal basis $\{v_j\}$ of eigenvectors, that is vectors
satisfying

$$Uv_j = \lambda_j v_j \quad\text{and}\quad \langle v_j|v_k\rangle = \delta_{jk}.$$


This is proved by the same technique as the theorem for self-adjoint
operators. With respect to the basis $\{v_j\}$ the matrix of $U$ is diagonal.
Since the diagonal entries are the eigenvalues this diagonal form is unique
up to the ordering of the eigenvalues. Moreover, each $\lambda_j$ has modulus 1, as
one sees by noting that $|\lambda_j|^2\|v_j\|^2 = \|Uv_j\|^2 = \|v_j\|^2$.
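
The statement that the eigenvalues of a unitary transformation have modulus 1 is easy to test numerically. In this sketch the unitary matrix is obtained as the Q factor of a QR factorization of a random complex matrix; that construction is an assumption of the example, chosen only because it is a convenient source of unitary matrices.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, _ = np.linalg.qr(M)              # Q factor of a square matrix is unitary

unitarity_error = np.linalg.norm(U.conj().T @ U - np.eye(4))
moduli = np.abs(np.linalg.eigvals(U))   # |lambda_j| for each eigenvalue
print(unitarity_error, moduli)
```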

We conclude this review of linear algebra with a variant of a well-known 
finite-dimensional result which shows that when two operators commute 
one may find simultaneous eigenvectors of both. 


Proposition A1.3.3. Let $A$ and $B$ be self-adjoint operators on the
inner product space $H$, let $H_A$ and $H_B$ denote the subspaces spanned
by eigenvectors of $A$ and $B$, respectively, and let $H_{A,B}$ denote the span
of the vectors which are simultaneously eigenvectors for both $A$ and
$B$. If $AB = BA$ then $H_{A,B} = H_A \cap H_B$.


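Proof. If $\psi$ is in the eigenspace $\ker(A - \alpha 1)$ for $A$ then

$$(A - \alpha 1)B\psi = B(A - \alpha 1)\psi = 0,$$

so that $B\psi \in \ker(A - \alpha 1)$. This shows that $B$ preserves each $A$-eigenspace,
and so also preserves their span, $H_A$. For $\phi \in H_A^\perp$ and $\psi \in H_A$ we have

$$\langle B\phi|\psi\rangle = \langle\phi|B\psi\rangle = 0,$$

so that $B(H_A^\perp) \subseteq H_A^\perp$, too.
By considering the components of a $B$-eigenvector in $H_A \oplus H_A^\perp$ we then
get

$$\ker(B - \beta 1) = \bigl(H_A \cap \ker(B - \beta 1)\bigr) \oplus \bigl(H_A^\perp \cap \ker(B - \beta 1)\bigr).$$

Summing over all eigenvalues $\beta$ we obtain

$$H_B = \bigoplus_\beta \bigl(H_A \cap \ker(B - \beta 1)\bigr) \oplus \bigoplus_\beta \bigl(H_A^\perp \cap \ker(B - \beta 1)\bigr),$$

and so

$$H_A \cap H_B = \bigoplus_\beta \bigl(H_A \cap \ker(B - \beta 1)\bigr).$$

On the other hand comparing the $\ker(A - \alpha 1)$ components of $B\psi$ and $\beta\psi$
for $\psi \in H_A \cap \ker(B - \beta 1)$ we see that the $\ker(A - \alpha 1)$ component of $\psi$ is
in $\ker(A - \alpha 1) \cap \ker(B - \beta 1)$, so

$$H_A \cap H_B = \bigoplus_{\alpha,\beta} \bigl(\ker(A - \alpha 1) \cap \ker(B - \beta 1)\bigr) \subseteq H_{A,B}.$$

The reverse inclusion is obvious, so the result now follows. □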


Corollary A1.3.4. If $AB = BA$ and $H$ admits an orthonormal basis
of eigenvectors for $A$, then $H_{A,B} = H_B$.


Proof. Since $H$ admits an orthonormal basis of eigenvectors for $A$ we
have $H_A = H$, so that

$$H_{A,B} = H \cap H_B = H_B.$$ □
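
A small numerical sketch may make the corollary concrete. Two commuting self-adjoint matrices are built from a common (randomly chosen) unitary matrix; the matrices, their eigenvalues, and the variable names are all assumptions of this example. Because the eigenvalues of `A` are distinct, its eigenvectors are determined up to phase, and they turn out to be eigenvectors of `B` as well.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Q, _ = np.linalg.qr(M)                              # a convenient unitary matrix

# Self-adjoint operators sharing the eigenvectors given by the columns of Q
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.conj().T  # distinct eigenvalues
B = Q @ np.diag([5.0, 5.0, 7.0, 7.0]) @ Q.conj().T  # degenerate eigenvalues

commutator_norm = np.linalg.norm(A @ B - B @ A)

# Diagonalize A alone, then express B in the resulting eigenbasis
_, V = np.linalg.eigh(A)
B_in_A_basis = V.conj().T @ B @ V
off_diagonal_norm = np.linalg.norm(B_in_A_basis - np.diag(np.diag(B_in_A_basis)))
print(commutator_norm, off_diagonal_norm)
```

A vanishing off-diagonal part means that every eigenvector of `A` found by `eigh` is simultaneously an eigenvector of `B`.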


A1.4. Groups 


Definition A1.4.1. A group is a set $G$ with a multiplication map
$G \times G \to G$, denoted by $(x, y) \mapsto xy$, and an identity element $1 \in G$
such that

(i) $(xy)z = x(yz)$, for all $x$, $y$, and $z$ in $G$;

(ii) $1x = x = x1$, for all $x$ in $G$;

(iii) for each $x$ in $G$ there exists an inverse $x^{-1} \in G$, such that $xx^{-1} = 1 = x^{-1}x$.


Definition A1.4.2. A group $G$ is said to be abelian if $xy = yx$ for
all $x$ and $y$ in $G$.


Definition A1.4.3. A subgroup $H$ of a group $G$ is a subset such that
for all $x$ and $y$ in $H$, $x^{-1}y$ is also in $H$. It is a normal subgroup if, for
all $g$ in $G$ and all $x$ in $H$, $g^{-1}xg$ is in $H$.


Conjugation by $g \in G$ is defined to send $x \in G$ to $gxg^{-1}$. In abelian groups
conjugation has no effect, since $x = gxg^{-1}$, so every subgroup of an abelian
group is normal.
Definition A1.4.4. If $N$ is a normal subgroup of $G$ then the cosets
$xN = \{xn \in G : n \in N\}$ can be given a multiplication

$$(xN)(yN) = xyN$$

with respect to which they form a group, $G/N$, called the quotient
group.


Definition A1.4.5. Let $G$ and $H$ be groups. A map $\phi$ from $G$
to $H$ such that $\phi(x)\phi(y) = \phi(xy)$ for all $x$ and $y$ in $G$ is called a
homomorphism.


Definition A1.4.6. An isomorphism is a homomorphism which is
one-one and onto. If there exists an isomorphism from $G$ to $H$ then
$G$ and $H$ are said to be isomorphic.


Definition A1.4.7. The kernel of a homomorphism $\phi$ is the set

$$\ker\phi = \{x \in G : \phi(x) = 1\}.$$

The image is the set

$$\operatorname{im}\phi = \{\phi(x) \in H : x \in G\}.$$


Theorem A1.4.1. A homomorphism $\phi$ is one-one if and only if
$\ker(\phi) = \{1\}$, and is onto if and only if $\operatorname{im}(\phi) = H$.


The first isomorphism theorem A1.4.2. The kernel of a homomorphism
$\phi : G \to H$ is a normal subgroup of $G$ and the quotient
group $G/\ker(\phi)$ is isomorphic to $\operatorname{im}(\phi)$.
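
A finite example can make the first isomorphism theorem tangible. The sketch below uses the additive groups of integers modulo 12 and modulo 4 with the map sending $x$ to $x \bmod 4$; this particular choice of groups and homomorphism is an assumption made purely for illustration, and the group operation is written additively rather than multiplicatively.

```python
n, m = 12, 4                      # G = Z_12, H = Z_4 (hypothetical example)
G = range(n)

def phi(x):
    """Candidate homomorphism from Z_12 to Z_4."""
    return x % m

# Homomorphism property, written additively: phi(x + y) = phi(x) + phi(y)
is_homomorphism = all(phi((x + y) % n) == (phi(x) + phi(y)) % m
                      for x in G for y in G)

kernel = {x for x in G if phi(x) == 0}           # elements sent to the identity
image = {phi(x) for x in G}
cosets = {frozenset((x + k) % n for k in kernel) for x in G}

# The induced map coset -> phi(representative) must be well defined and bijective
induced = {coset: {phi(x) for x in coset} for coset in cosets}
well_defined = all(len(values) == 1 for values in induced.values())
bijective = (well_defined
             and len(cosets) == len(image)
             and {next(iter(v)) for v in induced.values()} == image)
print(sorted(kernel), len(cosets), well_defined, bijective)
```

Here the quotient by the kernel has four cosets, matching the four elements of the image.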


Appendix A2 Open systems 


If I have understood correctly your point of view then you would gladly
sacrifice the simplicity [of quantum mechanics] to the principle of causality.
Perhaps we could comfort ourselves that the dear Lord could go beyond
[quantum mechanics] and maintain causality.


WERNER HEISENBERG, letter to Einstein, 10 June 1927


In this section we shall show how to combine a two-dimensional quan- 
tum system with an ‘environment’ whose time evolution drives the original 
system asymptotically towards a collapse of the kind that occurs during 
a measurement. More precisely we prove the following result, stated in 
Section 16.7, 


Theorem A2.0.1. Let $V$ be a two-dimensional inner product space
and $\Omega$ a vector in $V$. Then there exists an inner product space $H$, a
family of unitary operators $U_t$, a homomorphism $\phi : \mathcal{L}(V) \to \mathcal{L}(H)$
which respects adjoints, and a linear transformation which takes a
vector $\psi \in V$ to $\Psi \in H$ such that for all $A \in \mathcal{L}(V)$ we have

$$\langle\Psi|\phi(A)\Psi\rangle = \langle\psi|A\psi\rangle,$$

$$\lim_{t\to\infty}\langle U_t\Psi|\phi(A)U_t\Psi\rangle = \langle\Omega|A\Omega\rangle,$$

where the inner products on the left are in $H$ and those on the right
are in $V$.


Proof. We take the space $H$ to be the direct sum of spaces $H_n$, for $n =
0, 1, 2, \ldots$, where $H_0 = \mathbf{C}$, and, for $n \geq 1$, $H_n$ is the space of antisymmetric
normalizable wave functions $\psi_n$ on $\mathbf{R}^n$, that is $\psi_n(s_1, s_2, \ldots, s_n)$ changes
sign whenever any two of its arguments are interchanged. In particular, $H_1$
is just the space of normalizable wave functions on $\mathbf{R}$, and $H_2$ is the space
of normalizable wave functions on $\mathbf{R}^2$ such that $\psi_2(s_1, s_2) = -\psi_2(s_2, s_1)$.
The time evolution operators $U_t$ are defined on $H_n$ by

$$(U_t\psi_n)(s_1, \ldots, s_n) = \exp[-i(s_1 + \ldots + s_n)t]\,\psi_n(s_1, \ldots, s_n). \quad (A2.1)$$
Let $f$ be a normalized function on $\mathbf{R}$ and define for each $n$ an operator
$e : H_n \to H_{n+1}$ by

$$(e\psi_n)(s_1, s_2, \ldots, s_{n+1}) = \frac{1}{\sqrt{n+1}} \sum_{k=1}^{n+1} (-1)^{k-1} f(s_k)\,\psi_n(s_1, \ldots, s_{k-1}, s_{k+1}, \ldots, s_{n+1}). \quad (A2.2)$$

(This is essentially just multiplication by $f$; the sum is there only to ensure
that $e\psi_n$ is an antisymmetric function.)

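We also define a map $e^*$ that sends $H_0$ to 0 and maps $H_{n+1}$ to $H_n$, for
$n \geq 0$, by

$$(e^*\psi_{n+1})(s_1, \ldots, s_n) = \sqrt{n+1}\int_{\mathbf{R}} \overline{f(s)}\,\psi_{n+1}(s, s_1, \ldots, s_n)\,ds. \quad (A2.3)$$

Using the antisymmetry of $\psi_{n+1}$, this can also be written as

$$\frac{1}{\sqrt{n+1}} \sum_{k=1}^{n+1} (-1)^{k-1} \int_{\mathbf{R}} \overline{f(s)}\,\psi_{n+1}(s_1, \ldots, s_{k-1}, s, s_k, \ldots, s_n)\,ds, \quad (A2.4)$$

from which it is easy to check that

$$\langle\varphi_n|e^*\psi_{n+1}\rangle = \langle e\varphi_n|\psi_{n+1}\rangle, \quad (A2.5)$$

so that $e^*$ is the adjoint of $e$. Using the antisymmetry of the function $\psi_{n+1}$
we see that $e^{*2}\psi_{n+1}$ is given by

$$\sqrt{n(n+1)}\int_{\mathbf{R}^2} \overline{f(s)}\,\overline{f(t)}\,\psi_{n+1}(s, t, u_1, \ldots, u_{n-1})\,ds\,dt = 0, \quad (A2.6)$$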
R? ! 


so that $e^{*2} = 0$. Taking adjoints we also have $e^2 = 0$. The remaining
important identity relates $e$ and $e^*$. On the right-hand side of the identity

$$(e^*e\psi_n)(u_1, \ldots, u_n) = \sqrt{n+1}\int_{\mathbf{R}} \overline{f(s)}\,(e\psi_n)(s, u_1, \ldots, u_n)\,ds \quad (A2.7)$$

the expression $\sqrt{n+1}\,(e\psi_n)(s, u_1, \ldots, u_n)$ is the sum of $f(s)\psi_n(u_1, \ldots, u_n)$ and

$$\sum_{k=1}^{n} (-1)^{k} f(u_k)\,\psi_n(s, u_1, \ldots, u_{k-1}, u_{k+1}, \ldots, u_n). \quad (A2.8)$$

When we integrate against $\overline{f(s)}$ this latter expression gives

$$-(ee^*\psi_n)(u_1, \ldots, u_n), \quad (A2.9)$$

whilst the former gives $\|f\|^2\psi_n$. We therefore obtain $e^*e\psi_n = \|f\|^2\psi_n - ee^*\psi_n$,
or $e^*e + ee^* = \|f\|^2 = 1$. Summarizing we have

$$e^2 = 0 = e^{*2} \quad\text{and}\quad ee^* + e^*e = 1. \quad (A2.10)$$


To describe the homomorphism $\phi$ we choose an orthonormal basis $v_1, v_2$
for $V$ with $v_2 = \Omega$ and write $A \in \mathcal{L}(V)$ as a matrix. Then we set

$$\phi(A) = A_{11}ee^* + A_{21}e^* + A_{12}e + A_{22}e^*e. \quad (A2.11)$$

This clearly depends linearly on $A$ and, using the formulae for products of
$e$ and $e^*$, it is easy to check that $\phi(AB) = \phi(A)\phi(B)$, $\phi(1) = 1$, and also
that $\phi(A)^* = \phi(A^*)$, so that $\phi$ preserves self-adjointness.
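
The algebra behind these checks can be tried out in a finite-dimensional toy model. The sketch below is not the construction used in the proof: it replaces the space $H$ by a two-dimensional space spanned by the vacuum and a single one-particle state, an assumption made only so that $e$ and $e^*$ become 2 x 2 matrices satisfying (A2.10). The function name `phi` and the random test matrices are likewise hypothetical.

```python
import numpy as np

# Toy model: basis index 0 = vacuum, index 1 = one-particle state.
e = np.array([[0, 0], [1, 0]], dtype=complex)   # sends the vacuum to the one-particle state
e_star = e.conj().T
identity = np.eye(2, dtype=complex)

def phi(A):
    """phi(A) = A11 e e* + A21 e* + A12 e + A22 e* e, as in (A2.11)."""
    return (A[0, 0] * e @ e_star + A[1, 0] * e_star
            + A[0, 1] * e + A[1, 1] * e_star @ e)

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

relations_hold = (np.allclose(e @ e, 0)
                  and np.allclose(e @ e_star + e_star @ e, identity))
multiplicative = np.allclose(phi(A @ B), phi(A) @ phi(B))
unital = np.allclose(phi(identity), identity)
respects_adjoints = np.allclose(phi(A).conj().T, phi(A.conj().T))
print(relations_hold, multiplicative, unital, respects_adjoints)
```

Only the relations $e^2 = 0$ and $ee^* + e^*e = 1$ are used in these identities, which is why the same checks go through for the operators defined in the proof.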
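Now, for any $\Psi \in H$, we have

$$\langle U_t\Psi|\phi(A)U_t\Psi\rangle = A_{11}\langle U_t\Psi|ee^*U_t\Psi\rangle + A_{21}\langle U_t\Psi|e^*U_t\Psi\rangle + A_{12}\langle U_t\Psi|eU_t\Psi\rangle + A_{22}\langle U_t\Psi|e^*eU_t\Psi\rangle. \quad (A2.12)$$

For any $\Phi \in H_n$ and $\Phi' \in H_{n+1}$ consider

$$\langle U_t\Phi|e^*U_t\Phi'\rangle = \sqrt{n+1}\int_{\mathbf{R}^{n+1}} e^{-ist}\,\overline{\Phi(u_1, \ldots, u_n)}\,\overline{f(s)}\,\Phi'(s, u_1, \ldots, u_n)\,ds\,d^nu. \quad (A2.13)$$

By the Riemann-Lebesgue lemma this integral tends to 0 as $t \to \infty$. Similarly,
for $\Psi \in H_{n+1}$,

$$\langle U_t\Psi|ee^*U_t\Psi\rangle = \|e^*U_t\Psi\|^2 \quad (A2.14)$$

can also be written as

$$(n+1)\int_{\mathbf{R}^{n+2}} e^{i(r-s)t} f(r)\,\overline{\Psi(r, u_1, \ldots, u_n)}\,\overline{f(s)}\,\Psi(s, u_1, \ldots, u_n)\,dr\,ds\,d^nu, \quad (A2.15)$$

which tends to 0 as $t \to \infty$. The first three terms in the expression for
$\langle U_t\Psi|\phi(A)U_t\Psi\rangle$ therefore tend to 0 as $t \to \infty$. The remaining term can be
calculated by observing that

$$\langle U_t\Psi|e^*eU_t\Psi\rangle = \langle U_t\Psi|(1 - ee^*)U_t\Psi\rangle. \quad (A2.16)$$

It follows from the previous discussion that for large $t$ this approaches

$$\|U_t\Psi\|^2 - 0 = \|\Psi\|^2. \quad (A2.17)$$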




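Recalling that $A_{22} = \langle\Omega|A\Omega\rangle$, we may combine these results for normalized
$\Psi$ to obtain

$$\langle U_t\Psi|\phi(A)U_t\Psi\rangle \to A_{22}\|\Psi\|^2 = \langle\Omega|A\Omega\rangle. \quad (A2.18)$$

Finally, given $\psi = \psi_1v_1 + \psi_2v_2 \in V$, we set

$$\Psi = \psi_1 f + \psi_2 \in H_1 \oplus H_0. \quad (A2.19)$$

This can also be written as $(\psi_1e + \psi_2)1$ where 1 is the constant function in
$H_0$. By direct calculation we have, writing $\Phi$ for the image of a second
vector $\varphi = \varphi_1v_1 + \varphi_2v_2 \in V$,

$$\langle\Phi|\Psi\rangle = \overline{\varphi_1}\psi_1\|f\|^2 + \overline{\varphi_2}\psi_2 = \langle\varphi|\psi\rangle. \quad (A2.20)$$

Bearing in mind the formulae for products of $e$ and $e^*$, and the fact that
$e^*1 = 0$, we see that

$$\begin{aligned}
\phi(A)\Psi &= (A_{11}ee^* + A_{21}e^* + A_{12}e + A_{22}e^*e)(\psi_1e + \psi_2)1 \\
&= (A_{11}\psi_1e + A_{21}\psi_1e^*e + A_{12}\psi_2e + A_{22}\psi_2e^*e)1 \\
&= (A_{11}\psi_1 + A_{12}\psi_2)f + (A_{21}\psi_1 + A_{22}\psi_2),
\end{aligned} \quad (A2.21)$$

which is the image of $A\psi$. Using equation (A2.20) this means that

$$\langle\Psi|\phi(A)\Psi\rangle = \langle\psi|A\psi\rangle, \quad (A2.22)$$

as required. □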


This argument merely shows that it is possible to mimic the effects of a
projection asymptotically, but it can be generalized to give more plausible
models. These occur naturally whenever one has a dissipative system (one
that loses energy to its environment). The space $H_n$ can be regarded
as describing $n$-particle states of the system and its environment, with
$s_1, \ldots, s_n$ defining the frequencies of the associated waves. The space $V$
describes the system and its observables can also be regarded as observables
on $H$ by using the homomorphism $\phi$. The sort of wave function, $f \in H_1$,
that arises in these cases is

$$f(s) = \sqrt{\frac{2\eta}{\pi}}\left(\frac{\omega}{s^2 - \omega^2 + 2i\eta s}\right). \quad (A2.23)$$

The parameter $\omega$ plays the role of a natural frequency of the system (or of
the system together with the measuring apparatus) and $\eta$ is the strength
of the coupling between the system and its environment. It is also possible
to be more precise about the rate of convergence, which is actually
proportional to $\langle f|U_tf\rangle$. This is a multiple of $\exp(-\eta t)$. The stronger the
coupling to the environment the faster the collapse.
Some physicists and mathematicians


This list includes just some of those who contributed to the development 
of quantum theory or its mathematical techniques. The abbreviation N 
stands for a Nobel Prize in Physics (except in the case of Rutherford where 
the award was in Chemistry). Heisenberg’s 1932 prize was actually awarded 
a year later in 1933. 


John Stewart BELL (28 Nov 1928-1 Oct 1990) 
Niels Henrik David BOHR (7 Oct 1885-18 Nov 1962, N 1922) 
Max BORN (11 Dec 1882-5 Jan 1970, N 1954) 


Louis-Victor Pierre Raymond, Prince de BROGLIE (15 Aug 1892-19 Mar 
1987, N 1929) 


Paul Adrien Maurice DIRAC (8 Aug 1902-20 Oct 1984, N 1933) 
Albert EINSTEIN (14 Mar 1879-18 Apr 1955, N 1921) 

Enrico FERMI (29 Sep 1901-28 Nov 1954, N 1938)

Richard Phillips FEYNMAN (11 May 1918-15 Feb 1988, N 1965)
Werner Karl HEISENBERG (5 Dec 1901-1 Feb 1976, N 1932)

David HILBERT (23 Jan 1862-14 Feb 1943)

Wolfgang PAULI (25 Apr 1900-15 Dec 1958, N 1945)

Max Karl Ernst Ludwig PLANCK (23 Apr 1858-4 Oct 1947, N 1919) 


John William Strutt, Third Baron RAYLEIGH (12 Nov 1842-30 Jun 1919, 
N 1904) 


Ernest, First Baron RUTHERFORD of Nelson and Cambridge (30 Aug 
1871-19 Oct 1937, N 1908) 


Erwin SCHRÖDINGER (12 Aug 1887-4 Jan 1961, N 1933)
John Louis von NEUMANN (28 Dec 1903-8 Feb 1957)
Hermann WEYL (9 Nov 1885-8 Dec 1955) 

Eugene Paul WIGNER (17 Nov 1902-1 Jan 1995, N 1963) 
YUKAWA Hideki (23 Jan 1907-8 Sep 1981, N 1949) 
Further background reading


There are so many possible applications of quantum theory and devel- 
opments of its mathematical structure, that we mention only the totally 
different approach described in Quantum mechanics and path integrals by 
R. Feynman and A. Hibbs, McGraw-Hill, 1965. We reluctantly omitted 
this very appealing description of quantum mechanics from the main text 
because it is much harder to give a mathematical justification of its meth- 
ods except in simple cases. Those wishing to discover more about the 
mathematical approach to quantum mechanics could consult Methods of 
mathematical physics, Volume I, by M. Reed and B. Simon, Academic 
Press, 1972. 

One can obtain a good impression of the range of further possibilities 
from The quantum universe by T. Hey and P. Walters, Cambridge UP, 
1987, and Chapters 5, 10, and 13-18 in The new physics, edited by P. 
Davies, Cambridge UP, 1989 (particularly Chapter 13 on The conceptual 
foundations of quantum mechanics by A. Shimony). Most of the classical 
papers on quantum measurement and the paradoxes it engenders have been 
collected into the volume Quantum theory and measurement, edited by 
J.A. Wheeler and W.H. Zurek (Princeton UP, 1983). A selection of Bell’s 
papers on the subject have been collected into Speakable and unspeakable 
in quantum mechanics (Cambridge UP, 1987). The book The philosophy 
of quantum mechanics by M. Jammer (Wiley Interscience, 1974) is still a 
useful account of various interpretations of the theory. 

The series of volumes on The historical development of quantum theory 
by J. Mehra and H. Rechenberg (Springer-Verlag, 1982-) contains a wealth 
of historical information. Many of the original papers on quantum theory 
are printed (in translation) in Sources of quantum mechanics by B.L. van 
der Waerden, Dover, 1967. Another useful survey of the history is given by 
A. Pais in Inward bound, Oxford UP, 1986, and The Born-Einstein letters, 
edited by H. Born, Walker, 1971, provide an insight into the problems and 
excitement as the new discoveries were made. 

Autobiographical works by quantum physicists include the following. 


M. Born, My life, Taylor and Francis, 1978. 
R.P. Feynman, “Surely you must be joking Mr Feynman!”, Unwin, 1985,
and “What do you care what people think?”, Unwin, 1988.
W. Heisenberg, Physics and beyond, Harper and Row, 1971. 
H. Yukawa, Tabibito, translated by L. Brown and R. Yoshida, World Sci- 
entific, 1979. 

There are also numerous biographies of individual scientists, and the 
following represent just a small selection.

A.P. French and P.J. Kennedy, Niels Bohr: a centenary volume, Harvard UP,
1985.

A. Pais, Niels Bohr’s times, Oxford UP, 1991.

H. Kragh, Dirac: a scientific biography, Cambridge UP, 1990.

A. Pais, “Subtle is the Lord”, Oxford UP, 1982, and Einstein lived here,
Oxford UP, 1994.

J. Gleick, Genius, Little, Brown and Company, 1992.

J. Mehra, The beat of a different drum, Oxford UP, 1994.

D. Cassidy, Uncertainty, W.H. Freeman, 1992.

J.L. Heilbron, The dilemmas of an upright man, University of California
Press, 1986.

D. Wilson, Rutherford, Hodder and Stoughton, 1983.

W. Moore, Schrödinger, Cambridge UP, 1989, and its abridgement.
Epilogue


When Hitler became German Chancellor in 1933, most of the German and 
Austrian physicists involved in the development of quantum theory left the 
country, whether Jewish or not. Of the more prominent, only Heisenberg 
and the elderly Planck remained, hoping to keep physics alive despite the 
political difficulties. (Planck’s one remaining son was executed in February 
1945 for conspiring to assassinate Hitler.) Johannes Stark, who as a young 
man had been an early proponent of relativity and the quantum hypothesis, 
was by now a bitter opponent of what he termed ‘Jewish physics’, which 
included anything modern and abstract. When Heisenberg wrote an article 
in which relativity theory was mentioned favourably, Stark denounced him 
to the Gestapo. 

After fleeing from Berlin to Oxford, Schrödinger made an ill-advised
return to his native Austria in 1936, only to become a refugee again when
Hitler annexed the country two years later. He eventually became the 
first Professor of Theoretical Physics at the Institute of Advanced Stud- 
ies in Dublin. Born left Germany in 1933 and after a couple of years in 
Cambridge became Professor at Edinburgh. Bohr managed to escape from 
Copenhagen during the Nazi occupation, and was smuggled out of Sweden 
in the bomb bay of a Mosquito, almost dying of asphyxiation on the way. 
He joined many of the other mathematicians and physicists who had taken 
part in the development of quantum theory in the United States. Fermi 
supervised the building of the first nuclear reactor which went critical on 
2 December 1942 in the university squash courts in the centre of Chicago. 
Many other physicists were involved directly or indirectly in the Manhattan 
project to develop an atomic bomb, a project which combined the indepen- 
dent programmes started in Britain and the United States. In each case the 
project had been driven by the realization of some of the refugee physicists 
that the 1939 discovery of chain reactions by Hahn and Strassmann had 
made such weapons possible, and that Germany possessed the greater po- 
tential to build them. In fact, having extricated himself from the Gestapo, 
Heisenberg had been put in charge of the German bomb project and, for 
whatever reason, had carried it forward only slowly. 

After the war the leading German scientific society, the Kaiser Wilhelm 
Gesellschaft, was renamed the Max Planck Gesellschaft in Planck’s honour. 
Hints for the solution of selected exercises


Chapter 1

1.1 (i) The energy of each photon is

$$\hbar\omega = (1.0546 \times 10^{-34}) \times (2\pi \times 2 \times 10^{5}) = 1.3252 \times 10^{-28}\ \mathrm{J};$$

(ii)

$$\hbar\omega = (1.0546 \times 10^{-34}) \times (2\pi \times 4.95 \times 10^{14}) = 3.2800 \times 10^{-19}\ \mathrm{J};$$

(iii)

$$\hbar\omega = (1.0546 \times 10^{-34}) \times (2\pi \times 10^{18}) = 6.6262 \times 10^{-16}\ \mathrm{J}.$$


1.2 We have 200 kW $= 2 \times 10^{5}\ \mathrm{J\,s^{-1}}$, which, using the answer to 1.1(i),
is sufficient to create

$$\frac{2 \times 10^{5}}{1.3252 \times 10^{-28}} = 1.5092 \times 10^{33}$$

photons per second, if used with 100% efficiency. The aerial occupies
1 square metre out of $4\pi \times (10^{6})^2$, so that we would expect it to be
hit by

$$\frac{1.5092 \times 10^{33}}{4\pi \times 10^{12}} = 1.2010 \times 10^{20}$$

photons each second.

Were it to be at a distance of 3000 million kilometres, that is 3
million times further, then it should be struck by

$$\frac{1.2010 \times 10^{20}}{9 \times 10^{12}} = 1.334 \times 10^{7}$$

photons per second.
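
The arithmetic in hints 1.1(i) and 1.2 can be reproduced in a few lines. This is only a sketch: the constant name `HBAR`, the helper `photon_energy`, and the variable names are assumptions of the example, and the inputs (a 200 kW transmitter at 200 kHz, a 1 square metre aerial at 1000 km) are read off from the numbers in the hints. Small differences in the last digit arise because the hints round intermediate results.

```python
import math

HBAR = 1.0546e-34                    # J s, the value used in the hints

def photon_energy(frequency_hz):
    """Energy hbar * omega of one photon, with omega = 2 pi * frequency."""
    return HBAR * 2 * math.pi * frequency_hz

energy_radio = photon_energy(2e5)                 # hint 1.1(i)
photons_per_second = 2e5 / energy_radio           # 200 kW at 100% efficiency
# The aerial presents 1 m^2 out of a sphere of radius 10^6 m
hits_near = photons_per_second / (4 * math.pi * (1e6) ** 2)
# 3 million times further away the flux falls by (3 x 10^6)^2
hits_far = hits_near / (3e6) ** 2
print(energy_radio, photons_per_second, hits_near, hits_far)
```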


Chapter 2 


2.1 The wave number, $k$, of the photon is

$$\frac{2\pi}{5.89 \times 10^{-7}} = 1.0668 \times 10^{7}\ \mathrm{m}^{-1},$$

so, writing $v$ for the recoil velocity of the atom, the de Broglie law
gives

$$(3.82 \times 10^{-26})\,v = (1.0546 \times 10^{-34}) \times (1.0668 \times 10^{7}) = 1.1250 \times 10^{-27},$$

from which we deduce that $v = 2.945 \times 10^{-2}$ metres per second, or
a little over an inch per second.

2.2 The Schrödinger equation can be written in the form

$$-\frac{\hbar^2}{2m}\psi'' = (E - V_0)\psi,$$

which makes it obvious that it is the same as the square well, treated
in Section 2.2, except that $E$ has been replaced by $E - V_0$. Arguing
as there, we get the square-well formula with $E$ replaced by $E - V_0$,
from which the result follows.

2.3 When we substitute $\psi = X(x)Y(y)$ into the Schrödinger equation
and divide by $XY$, we obtain

$$-\frac{\hbar^2}{2m}\left(\frac{X''}{X} + \frac{Y''}{Y}\right) = E,$$

in which the variables are separated. We may therefore deduce that
$-(\hbar^2/2m)X''/X$ is a constant, which we may write as $E_x$, to get the
equation

$$-\frac{\hbar^2}{2m}X'' = E_xX,$$

which is the one-particle Schrödinger equation. The boundary conditions
still force $X$ to vanish when $x = 0$ or $a$, so that $E_x =
j^2\pi^2\hbar^2/2ma^2$, for some positive integer $j$. Combining this with the
result of a similar argument for $Y$ then gives

$$E = \frac{j^2\pi^2\hbar^2}{2ma^2} + \frac{k^2\pi^2\hbar^2}{2mb^2},$$

where $k$ is also a positive integer. The energy $5\pi^2\hbar^2/2ma^2$ is obtained
by taking $j^2 + k^2 = 5$, which means that $j = 2$ and $k = 1$ or
$j = 1$ and $k = 2$. The wave functions are, in general,

$$\psi = \frac{2}{\sqrt{ab}}\sin\frac{j\pi x}{a}\sin\frac{k\pi y}{b},$$

which reduces to

$$\psi = \frac{2}{a}\sin\frac{j\pi x}{a}\sin\frac{k\pi y}{a}$$

when $a = b$. The required probability is the integral of $|\psi|^2$ over the
region where $x < y$, which turns out to be $\tfrac12$.
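
The value $\tfrac12$ quoted in hint 2.3 can be confirmed by a direct numerical integration. The sketch below takes $a = b = 1$ and uses a midpoint grid, with the cells on the diagonal $x = y$ given half weight; the function name, the grid size, and the choice of units are assumptions made for this illustration.

```python
import numpy as np

def probability_x_less_than_y(j, k, n=400):
    """Integrate |psi|^2 over x < y for psi = 2 sin(j pi x) sin(k pi y) on the unit square."""
    points = (np.arange(n) + 0.5) / n                 # midpoints of n equal cells
    X, Y = np.meshgrid(points, points, indexing="ij")
    density = (2 * np.sin(j * np.pi * X) * np.sin(k * np.pi * Y)) ** 2
    # Cells strictly above the diagonal count fully, diagonal cells count half
    weight = np.where(X < Y, 1.0, 0.0) + np.where(X == Y, 0.5, 0.0)
    return float(np.sum(density * weight) / n ** 2)

p21 = probability_x_less_than_y(2, 1)
p12 = probability_x_less_than_y(1, 2)
print(p21, p12)
```

Both states with $j^2 + k^2 = 5$ give a probability close to one half, as the hint states.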


2.4 Schrödinger's equation gives

$$-\frac{\hbar^2}{2m}\frac{d^2}{dr^2}(r\psi) = E\,(r\psi),$$

so that $r\psi$ satisfies the one-dimensional Schrödinger equation. Moreover,
we are told to assume that $\psi(a)$ vanishes, and if $\psi$ is finite at
the origin then $r\psi$ vanishes there, so the boundary conditions on $r\psi$
are the same as in one dimension. The only thing that changes is
the normalization condition, which, on using the formula for the volume
element in spherical polar coordinates and integrating out the
angles, becomes

$$1 = \int |\psi|^2\,dV = 4\pi\int_0^a |r\psi|^2\,dr.$$

The probability is $\tfrac12$.

2.5 The mean $z$ coordinate is given by

$$\int_{r<a} z|\psi|^2\,dV,$$

and this vanishes since the integrand is an odd function in the sphere.
Arguing similarly for $x$ and $y$, the mean position is at the origin. The
variance of the height is therefore given by

$$\int_{r<a} z^2|\psi|^2\,dV.$$

By symmetry this must be the same as the variances of $x$ and $y$, and
so all three are given by the average

$$\frac13\int_{r<a}(x^2 + y^2 + z^2)|\psi|^2\,dV = \frac13\int_{r<a} r^2|\psi|^2\,dV,$$

in which form it is easy to calculate the answer

$$\frac{a^2}{3}\left(\frac13 - \frac{1}{2n^2\pi^2}\right).$$

2.6 Notice that


which evolves to 
1 
J/2 


where $E_n = n^2\pi^2\hbar^2/2ma^2$.


[eBay + eat lh | ; 


2.7 After the barrier has been removed we must change the box length
to $2a$; this gives the formula for the energies and shows that the
normalized wave functions are

$$u_n(x) = \frac{1}{\sqrt{a}}\sin\left(\frac{n\pi x}{2a}\right),$$

for $x \in (0, 2a]$. One must then expand the wave function

$$\psi(x) = \begin{cases} \sqrt{\dfrac{2}{a}}\sin\left(\dfrac{\pi x}{a}\right) & \text{if } x \in (0, a] \\ 0 & \text{otherwise} \end{cases}$$

as $\sum c_nu_n$. For unchanged energy one needs $n = 2$, and so the
probability is $|c_2|^2$ which turns out to be $\tfrac12$.

2.8 The condition for a vanishing probability current is

$$0 = \overline{\psi}\psi' - \psi\overline{\psi}',$$

from which we deduce that $\psi/\overline{\psi} = (\psi/|\psi|)^2$ is independent of $x$. Set
$\lambda = |\psi|/\psi$.


Chapter 3 


3.1 The normalized wave functions are 


2 
yi(z) = V =e wo, he(x) = 5 om 


3.2 Write the potential energy as a quadratic form in the coordinates.
The eigenvalues of the square matrix are 4 and 16, so that the normal
frequencies are $\sqrt{4}\,\omega = 2\omega$ and $\sqrt{16}\,\omega = 4\omega$ and the quantum
mechanical energies are given by

$$E = (n_1 + \tfrac12)2\hbar\omega + (n_2 + \tfrac12)4\hbar\omega = [(n_1 + 2n_2) + \tfrac32]2\hbar\omega.$$

The degeneracies are determined by the number of ways of expressing
a given positive integer as $n_1 + 2n_2$.

3.3 Separate variables by taking $\psi(x, y, z) = X(x)Y(y)Z(z)$. The oscillator
then splits into three one-dimensional oscillators and the possible
energies are found to be of the form

$$E = (n_1 + n_2 + n_3 + \tfrac32)\hbar\omega.$$

The degeneracy of the energy level $(N + \tfrac32)\hbar\omega$ is $\tfrac12(N + 1)(N + 2)$.

3.5 On substituting $y = x + \epsilon/m\omega^2$ into the Schrödinger equation

$$-\frac{\hbar^2}{2m}\psi'' + \frac12m\omega^2x^2\psi + \epsilon x\psi = E\psi$$

we see that

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dy^2} + \frac12m\omega^2y^2\psi = \left(E + \frac{\epsilon^2}{2m\omega^2}\right)\psi,$$

from which the result is easily deduced.

3.6 On substituting $\psi = R(r)\Theta(\theta)$ into Schrödinger's equation and
multiplying by $r^2/R\Theta$, we see that $\Theta''/\Theta$ must be constant. The
condition that $\Theta(\theta + 2\pi) = \Theta(\theta)$ forces that constant to be of the form
$-l^2$, with $l$ an integer. The radial equation then takes the form

$$-\frac{\hbar^2}{2m}\left(R'' + \frac{R'}{r} - \frac{l^2R}{r^2}\right) + \frac12m\omega^2r^2R = ER.$$

One then looks for the asymptotic form $\phi$ and tries $R(r) = f(r)\phi(r)$,
with $f$ given by a series, exactly as in the one-dimensional case. The
lowest energy solutions are given by

$$\psi(r, \theta) = Nr^{|l|}e^{il\theta}e^{-m\omega r^2/2\hbar},$$

where $l = 0$ gives the ground state and $l = \pm1$ gives the doubly
degenerate first excited state.

3.7 Work in the momentum representation: compute $(\mathcal{F}\psi_0)(p)$, on which
time evolution acts as multiplication by $\exp(-ip^2t/2m\hbar)$.
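
For hint 3.3, the degeneracy formula $\tfrac12(N + 1)(N + 2)$ can be checked by brute-force counting of the triples $(n_1, n_2, n_3)$ of non-negative integers with $n_1 + n_2 + n_3 = N$. The function name and the range of $N$ tested are choices made for this sketch.

```python
from itertools import product

def degeneracy(N):
    """Count the triples of non-negative integers (n1, n2, n3) summing to N."""
    return sum(1 for n1, n2, n3 in product(range(N + 1), repeat=3)
               if n1 + n2 + n3 == N)

counts = [degeneracy(N) for N in range(8)]
formula = [(N + 1) * (N + 2) // 2 for N in range(8)]
print(counts)
print(formula)
```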
Chapter 4

4.1 The wave functions are

$$\psi_2(r) = \left(\frac{Z^3}{8\pi a^3}\right)^{1/2}\left(1 - \frac{Zr}{2a}\right)e^{-Zr/2a},$$

$$\psi_3(r) = \left(\frac{Z^3}{27\pi a^3}\right)^{1/2}\left(1 - \frac{2Zr}{3a} + \frac{2Z^2r^2}{27a^2}\right)e^{-Zr/3a}.$$

4.2 Substitute $\psi = R(r)\Theta(\theta)$ into Schrödinger's equation, and multiply
by $r^2/R\Theta$ to separate the variables. Bearing in mind the fact that
$\Theta(\theta) = \Theta(\theta + 2\pi)$, one deduces that $\Theta'' = -l^2\Theta$ for positive integral
$l$. Treat the equation for $R$ by the usual procedure of looking for
a series solution times an asymptotic solution. The degeneracy is
$2N + 1$ (recall that $\pm l$ give the same equation for $R$).

4.3,4 Treat the radial equation by the usual combination of asymptotic
and series solution.
4.5 The energy eigenvalues are 
n? 
E= om [4nn + K(2c; + 1 — 2Ka?)] 
where 
o = 24+ (14+4)? + x20? 
The factor of 4 rather than 2 results from the fact that the power 
series contains only even terms. 
4.6 Set w = 0 and « = mw/fi in the previous question. 
4.7 Substitute $\psi = UVW$ and multiply by $uv/UVW$ to get an equation
in which $w$ separates from $u$ and $v$. Since $w + 2\pi$ and $w$ define the
same point the resulting equation for $w$ has solutions only if $W$ is a
multiple of $\exp(i\mu w)$ for some integer $\mu$. When this is substituted,
the equation for U can be written in the form 


for some constant of separation A. A similar equation is satisfied by 
V but with A replaced by another constant B such that A+ B= 
4Z/a, with a the Bohr radius. The usual search for asymptotic and 
series solutions leads to the formula for E,, and the degeneracy n?. 


Chapter 5

5.2 It is more efficient to assume that the potential takes the constant
value $V_2$ inside the interval $[0, a]$ and $V_1$ outside it, so that part (a)
is the special case of $V_1 = 0$ and $V_2 = V_0$, whilst in (b) it is $V_2$ which
vanishes and $V_1 = V_0$.

5.4 The wave function $\psi$ must vanish for $x < 0$, so the boundary conditions
are that $\psi(0) = 0$ and that $\psi$ and $\psi'$ are continuous at $x = a$.
Taking

$$\psi(x) = Ae^{-ikx} + Be^{ikx}$$

in the region $x > a$, the boundary conditions give

$$Be^{ika}\left(1 - \frac{ik}{k_0}\tan(k_0a)\right) = Ae^{-ika}\left(1 + \frac{ik}{k_0}\tan(k_0a)\right),$$

where $k_0^2 = k^2 + 2mV_0/\hbar^2$. Since $k$ and $k_0$ are real this shows that
$|B| = |A|$, and enables us to deduce that $B = A\exp(i\phi)$, where

$$\tan\left(\tfrac12\phi + ka\right) = \frac{k}{k_0}\tan(k_0a).$$

5.5 Use the matching conditions at 0 and $a$ and the condition that a
bound state wave function must tend to 0 as $x \to \pm\infty$.

5.6 As in Exercise 2.4 the equation and boundary conditions for $r\psi(r)$
are similar to those in one dimension. For normalizable states $r\psi(r)$
must tend to 0 as $r \to \infty$. The limit of large $V_0$ can conveniently be
studied by writing the condition in the form

$$\sin(ka) = \pm\sqrt{\frac{\hbar^2}{2mV_0}}\,k$$

and considering where the graph of $\sin(ka)$ meets the line through
the origin with slope $\sqrt{\hbar^2/2mV_0}$.
5.7 The probability current is Ax(AB — BA)/im. 
5.8 Consider (p — ay) /b. 
Chapter 6 
6.1 It is clear that $E_\psi(A)$ is real if and only if $\langle\psi|A\psi\rangle$ is real, that is

$$\langle\psi|A\psi\rangle = \overline{\langle\psi|A\psi\rangle} = \langle A\psi|\psi\rangle.$$

We are told that this holds for all $\psi$, but what we really want is
$\langle\phi|A\psi\rangle = \langle A\phi|\psi\rangle$ for all $\phi$ and $\psi$. To bridge the gap, first show that

$$\langle\phi|A\psi\rangle = \frac14\sum_{n=0}^{3} i^n\langle\psi + i^n\phi|A(\psi + i^n\phi)\rangle,$$

and then apply the known identity.
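6.2 The expectation value of a self-adjoint operator is real, but in the
state $\psi \sim \exp(-\kappa r^2/2)$ we find that

$$E_\psi\left(-i\hbar\frac{\partial}{\partial r}\right) = i\hbar\kappa\int_0^\infty 4\pi r^3e^{-\kappa r^2}\,dr,$$

which is imaginary. The inner product in $\mathbf{R}^3$ can be written as

$$\langle\varphi|\psi\rangle = \int \overline{r\varphi}\,r\psi\,\sin\theta\,dr\,d\theta\,d\phi,$$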


and 


we, ; co aN. 
oar = ih (a; ive a g, 
from which the other part follows easily. 


6.3 The eigenvectors of $\mathcal{P}$ with eigenvalue 1 are given by even wave functions
and those with eigenvalue $-1$ are given by odd wave functions.


6.4 We have

$$(\mathcal{P}V\psi)(x) = (V\psi)(-x) = V(-x)\psi(-x) = V(-x)(\mathcal{P}\psi)(x),$$

which shows that even functions $V$ commute with $\mathcal{P}$. By applying
the chain rule or by Fourier transforming we can get $\mathcal{P}P = -P\mathcal{P}$,
from which it easily follows that $\mathcal{P}$ commutes with the kinetic energy
$P^2/2m$. If the system is in a non-degenerate eigenstate $\psi$, one first
shows that this is also an eigenstate of $\mathcal{P}$, and then uses

$$\langle\mathcal{P}\psi|P\mathcal{P}\psi\rangle = \langle\psi|\mathcal{P}P\mathcal{P}\psi\rangle = -\langle\psi|P\psi\rangle,$$

to deduce that the expectation value of $P$ must vanish.
6.5 The normalized eigenvectors corresponding to the eigenvalues $3\hbar\omega$,
0, and $-3\hbar\omega$ can be taken to be


2 2 1 
1 en oe wee 
b+ ay (2) ’ go = 3 (1). g- 3 
respectively. The initial vector can be expanded as


Wo = (b4+lY0)o4 + (PolVo)do + (b-\vo)o- = 2G4 of 24) + 44_, 


and therefore evolves to . 
be = Jeb, + 2g + per g_. 


Writing 2 for the second standard basis vector, the probability, po, 
is given by 


[Kealds)|? = | 267" (ealb4) + 3(caldo) + de™*{eal4.) |”. 
The result follows on calculating the inner products and simplifying 
the result. 

6.6 This can also be done by expanding in eigenvectors, but it is easier
to notice that, in terms of a standard basis $\epsilon_1$ and $\epsilon_2$, the initial
vector $\psi_0 = \epsilon_1$ evolves to $\psi_t = \exp(-iHt/\hbar)\epsilon_1$ and that the required
probability is

$$|\langle\epsilon_2|e^{-iHt/\hbar}\epsilon_1\rangle|^2.$$
It is easily checked that H? = fi7e?|B|?/4y?, from which one may 
calculate that 


“iHt/h = ong (2BIE) _ iat, (elBlet 

e€ cos ( om fielB| sin ma)? 
and the lower left-hand matrix elements now give the result. 

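6.7 Use

$$i\hbar\frac{d}{dt}\langle\phi|\psi\rangle = \left\langle -i\hbar\frac{d\phi}{dt}\Big|\psi\right\rangle + \left\langle\phi\Big|i\hbar\frac{d\psi}{dt}\right\rangle,$$

and substitute from Schrödinger's equation.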
6.8 Suppose that orthonormal bound state eigenfunctions before the nuclear
decay are $\psi_0, \psi_1, \ldots$, with $\psi_0$ the ground state, and afterwards
are $\psi_0', \psi_1', \ldots$. The probability of finding the new ground state
energy when one has the old ground state wave function is $|\langle\psi_0'|\psi_0\rangle|^2$.
The normalized ground state wave functions are 


[23 _gey , ‘Aa 
Yo = nae whee po = Fe i r/o 


so we calculate that 


(volvo) = 


The answer follows on evaluating the integral. 
Chapter 7
7.2 Use Proposition 7.1.2 repeatedly. 
7.3 Use Lemma 7.2.1 with A= P and B=X. 
7.7 For part (iii) recall that E(T) + E(V) = E. For part (iv) note that 
E(T) must be positive. 
7.10 Once one has shown that $N$ is a projection, it follows that the
expression $\psi = N\psi + (1 - N)\psi$ decomposes an arbitrary vector into
eigenvectors of $N$ with $N(N\psi) = N\psi$ and $N(1 - N)\psi = 0$. Using
the identity

$$\|a\psi\|^2 = \langle\psi|N\psi\rangle$$

it is easy to see that the kernels of $N$ and of $a$ coincide. Let
$\{\psi_1, \psi_2, \ldots\}$ be an orthonormal basis for the kernel of $N$. It can
be shown from the commutation relations between $N$ and $a^*$ and $a$
that $\{a^*\psi_1, a^*\psi_2, \ldots\}$ is an orthonormal basis for the image of $N$.
The space is spanned by the image and kernel of $N$, so we may form
a basis $\phi_{2j+1} = \psi_j$ and $\phi_{2j+2} = a^*\psi_j$, with respect to which we
obtain a matrix representation of the operators.

7.11 Introduce the operators $P = P_1 + \tfrac12eBX_2$ and $X = \tfrac12X_1 - P_2/eB$, and
show that they satisfy the canonical commutation relations. Since

$$H = \frac{P^2}{2m} + \frac{e^2B^2}{2m}X^2,$$

the Hamiltonian has a harmonic-oscillator-type spectrum. (This example,
due to Lev Landau, is of considerable importance in solid
state physics.)

7.14 With generating functions we may calculate 


2 se ty (bn |X* bm) 
M,N=0 


1 
= (= )° eel [u — (s + t)|Re~muu /t de. 
R 


When k = 1 this gives 
> sN 4M Ih 
= (vn |X) = —(8 + the?mvst/h, (19.24) 
NIM! 
M,N=0 


from which it follows, by comparing coefficients, that
$\langle\psi_N|X\psi_M\rangle$ vanishes unless $|N - M| = 1$.
Alternatively, we have the identity,

$$\langle\psi_N|P\psi_M\rangle \pm im\omega\langle\psi_N|X\psi_M\rangle = \langle\psi_N|a_\pm\psi_M\rangle. \qquad (19.25)$$

This expression vanishes unless $|N - M| = 1$, because $a_\pm\psi_M$ is a
multiple of $\psi_{M\pm1}$, and therefore orthogonal to $\psi_N$ unless
$N - M = \pm1$. Adding and subtracting, we see that
$\langle\psi_N|X\psi_M\rangle$ and $\langle\psi_N|P\psi_M\rangle$ both
vanish unless $|N - M| = 1$.
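The selection rule can be seen numerically in a truncated basis. The following is a minimal sketch, assuming normalized ladder operators with $a\psi_n = \sqrt{n}\,\psi_{n-1}$ (the text's $a_\pm$ are proportional to these) and measuring $X$ in units of $\sqrt{\hbar/2m\omega}$; the truncation size `DIM` and the helper names are illustrative choices.

```python
import math

DIM = 8  # number of basis states psi_0 .. psi_{DIM-1} kept (illustrative)

# Matrix of the lowering operator: <psi_n | a psi_m> = sqrt(m) when n = m - 1.
lower = [[math.sqrt(m) if n == m - 1 else 0.0 for m in range(DIM)]
         for n in range(DIM)]
# The raising operator is the transpose, since the entries are real.
upper = [[lower[m][n] for m in range(DIM)] for n in range(DIM)]

# Position operator in units of sqrt(hbar / (2 m omega)): X = a + a*.
X = [[lower[n][m] + upper[n][m] for m in range(DIM)] for n in range(DIM)]

def nonzero_entries(matrix, tol=1e-12):
    """Index pairs (N, M) whose matrix element exceeds tol in modulus."""
    size = len(matrix)
    return {(n, m) for n in range(size) for m in range(size)
            if abs(matrix[n][m]) > tol}

# Every surviving matrix element of X links neighbouring levels only.
selection_rule_holds = all(abs(n - m) == 1 for n, m in nonzero_entries(X))
```

The same check applied to $a - a^*$ gives the corresponding statement for $P$.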


7.15 Write 
bo(2) = e™2'/48G (_fa,z) 
and use the known evolution of G, to see that 
W(x) = e7 Ft e~ mwa? /4h (-ge"“a, x) . 

7.16 This result, known as Floquet’s theorem, and its three-dimensional 
analogue, Bloch’s theorem, are basic to solid state physics. See also 
Section 12.6. 

7.20 Multiply equation (7.52) by $\exp(-2m\omega a^2/\hbar)$ and integrate
against $a$.

Chapter 8 

8.4 Note that the mass in this case is $M$ not $m$. There will not be
bound states unless the potential is attractive, which means that the
$L_3$ eigenvalue, $m$, must be negative. As usual we also need $l \ge |m|$.

8.5 It is simplest to find the eigenstates, $\psi$, which have $L_3$
eigenvalue $\pm\hbar$ as the solutions of $L_\pm\psi = 0$. The remaining
state, with $L_3\psi = 0$, can be found as a multiple of $L_\mp\psi$
where $L_\pm\psi = 0$.

8.7 Show that

$$[L^2, X_j] = i\hbar\,\epsilon_{jkl}\,(L_l X_k + X_k L_l)$$

and recall that $L^2$ commutes with the components of $L$.

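8.8 When $\psi$ is an eigenstate Exercise 8.3 gives

$$\langle\psi|L_1^2\psi\rangle = \langle\psi|L_2^2\psi\rangle = \tfrac12\langle\psi|(L_1^2 + L_2^2)\psi\rangle,$$

which enables one to reduce $\langle\psi|H\psi\rangle$ to an expression
which involves only expectation values of $L^2$, $L_3^2$, and $L_3$.

8.10 This can also be done by the method of Exercise 7.11.

8.11 Use the identity

$$e^{i\mathbf{a}\cdot\boldsymbol{\sigma}} = \sum_k \frac{(i\mathbf{a}\cdot\boldsymbol{\sigma})^{2k}}{(2k)!} + \sum_k \frac{(i\mathbf{a}\cdot\boldsymbol{\sigma})^{2k+1}}{(2k+1)!} = \sum_k \frac{(i|\mathbf{a}|)^{2k}}{(2k)!} + \frac{\mathbf{a}\cdot\boldsymbol{\sigma}}{|\mathbf{a}|}\sum_k \frac{(i|\mathbf{a}|)^{2k+1}}{(2k+1)!}.$$

8.12 You can define $\phi_+$ as a normalized solution of $A^*\phi_+ = 0$,
and $\phi_- = \hbar^{-1}A\phi_+$. Then you need only check that these are
$S_3$-eigenvectors and that $A^*\phi_- = \hbar\phi_+$.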
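The exponential identity in the hint to Exercise 8.11 sums to $\cos|\mathbf{a}| + i\,(\mathbf{a}\cdot\boldsymbol{\sigma}/|\mathbf{a}|)\sin|\mathbf{a}|$, and this can be compared with a direct power series. The sketch below assumes that $\boldsymbol{\sigma}$ denotes the Pauli matrices in their usual representation; the function names and the number of series terms are illustrative.

```python
import math

# Pauli matrices in the usual representation (an assumption about the notation).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def a_dot_sigma(a):
    """The matrix a_1 sigma_1 + a_2 sigma_2 + a_3 sigma_3."""
    return [[sum(a[k] * SIGMA[k][i][j] for k in range(3)) for j in range(2)]
            for i in range(2)]

def exp_series(a, terms=40):
    """Partial sum of the power series for exp(i a.sigma)."""
    M = [[1j * x for x in row] for row in a_dot_sigma(a)]
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(1, terms):
        # term becomes M^k / k!
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def closed_form(a):
    """cos|a| + i (a.sigma / |a|) sin|a|, valid for a nonzero real vector a."""
    norm = math.sqrt(sum(x * x for x in a))
    s = a_dot_sigma(a)
    return [[math.cos(norm) * (i == j) + 1j * math.sin(norm) * s[i][j] / norm
             for j in range(2)] for i in range(2)]

def max_difference(a):
    """Largest entrywise gap between the series and the closed form."""
    S, C = exp_series(a), closed_form(a)
    return max(abs(S[i][j] - C[i][j]) for i in range(2) for j in range(2))
```

For example, `max_difference((0.3, -1.2, 0.7))` should be at the level of rounding error.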
8.13 The case $\alpha = 0$ corresponds to the canonical commutation
relations and $\beta = 0$ to the angular momentum operators.

8.14 The commutation relations given in (ii) provide a simple example
of a generalization of a group, called a quantum group. These have
recently been the object of intense study by mathematicians and
physicists.


Chapter 9 
9.2 The point of the first part is to show that if gk = g2k then 


(U(g1) Qh) = (U(g2) QI), 
so that T is well defined. 


Chapter 10 
10.1 First note that 


$$PU_{t/n}PU_{t/n}\cdots PU_{t/n}P = (PU_{t/n}P)^n P = \left[1 - \frac{it}{n\hbar}PHP + O\!\left(\frac{1}{n^2}\right)\right]^n P.$$


10.5 For the last part note that 


2 
2 T nr sin’ 
fe eee ee i, 

(n + 1) sin nt) ae 
where $\epsilon = \pi/2(n + 1)$, and consider the behaviour for small $\epsilon$.


Chapter 11 
11.3 By equation (11.19) we see that 


$$(P + im\omega X)_{t+\pi/2\omega} = e^{i\pi/2}(P + im\omega X)_t = i(P + im\omega X)_t,$$

from which it follows that $m\omega X_{t+\pi/2\omega} = P_t$.
11.4 Show that 


$$[X(t) - E(X(t))]/t = [P(0) - E(P(0))]/m + [X(0) - E(X(0))]/t.$$
11.5 It is easy to show from the commutation relations that 


$$e^{itL_3/\hbar} L_+ e^{-itL_3/\hbar} = e^{it}L_+,$$
and using this in the KMS condition we get
tr[e~P 14 L_] = eFtr[e“F49 L_ Ly]. 


The differential equation follows on writing $L_+L_-$ and $L_-L_+$ in
terms of $L^2$ and $L_3$.
11.6 In the last part note that 


“pr + EO) =e (Go +x) — (SEEN), 


2m 2m mM 


11.7 Show that the equation of motion for S implies that T is constant. 
In the last part show that 


$$E(J_1^2) = E(J_2^2) = \tfrac12 E(J_1^2 + J_2^2).$$


Chapter 12 
12.2 Note that the first-order correction to the energy level
$(n + \tfrac12)\hbar\omega$ is

$$\lambda E(X^4) = \lambda\frac{\|X^2\psi\|^2}{\|\psi\|^2},$$

where the wave function is related to the ground state wave function,
$\psi_0$, by $\psi = a_+^n\psi_0$.
12.4 One can reduce the work needed by noticing that all the matrix 


elements which appear in the calculation of the first-order energy 
corrections are of the form 


4. f(jnrx\ . f(kry\ . prmxy , sry 
cf evgzsin (HE) sin ( 5 ) sin ( 5 ) sin ( ra ) dady, 


for various values of j, k, r, and s, and these can be written as 
products of integrals of the form 


[eosin (=) sin (=) dz. 


12.6 For the exact answer note that 


$$BL_3 + CL_1 = \sqrt{B^2 + C^2}\;\mathbf{n}\cdot\mathbf{L}$$

where $\mathbf{n}$ is the unit vector

$$\mathbf{n} = \left(\frac{C}{\sqrt{B^2 + C^2}},\; 0,\; \frac{B}{\sqrt{B^2 + C^2}}\right),$$

and note that, by rotating coordinates, we may take $\mathbf{n}$ to $(0, 0, 1)$.
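A quick numerical sanity check of this rotation argument is possible for angular momentum 1, where $\mathbf{n}\cdot\mathbf{L}$ has eigenvalues $0, \pm\hbar$, so that $M = BL_3 + CL_1$ must satisfy $M^3 = (B^2 + C^2)\hbar^2 M$. The sketch below assumes the standard $3\times3$ matrices for $L_1$ and $L_3$ with $\hbar = 1$; the choice of spin 1 and the function names are illustrative and not taken from the exercise.

```python
import math

R = 1 / math.sqrt(2)
# Standard spin-1 matrices with hbar = 1 (an illustrative choice of representation).
L1 = [[0, R, 0], [R, 0, R], [0, R, 0]]
L3 = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

def matmul(A, B):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def combination(B, C):
    """The matrix B*L3 + C*L1."""
    return [[B * L3[i][j] + C * L1[i][j] for j in range(3)] for i in range(3)]

def cubic_residual(B, C):
    """Largest entry of M^3 - (B^2 + C^2) M, which should vanish."""
    M = combination(B, C)
    M3 = matmul(matmul(M, M), M)
    r2 = B * B + C * C
    return max(abs(M3[i][j] - r2 * M[i][j]) for i in range(3) for j in range(3))

def trace_of_square(B, C):
    """tr(M^2), which equals 2 (B^2 + C^2) for eigenvalues 0 and +/- sqrt(B^2 + C^2)."""
    M = combination(B, C)
    M2 = matmul(M, M)
    return sum(M2[i][i] for i in range(3))
```

A vanishing residual together with `trace_of_square(B, C)` equal to $2(B^2 + C^2)$ is consistent with the eigenvalues $0, \pm\sqrt{B^2 + C^2}$.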

12.8 The first excited state is doubly degenerate so one must first calculate 
the elements of a 2 x 2 matrix, and then find its eigenvalues. The 
integrals giving the matrix entries can be written in terms of integrals 
of the form 


[ 2. (w) debe») 


where $\psi_j$, $\psi_k$, $\psi_r$, and $\psi_s$ are one-dimensional
oscillator wave functions, with $j + r = 1 = k + s$. Unless $r = s$, and
so also $j = k$, this expression vanishes, so the matrix is already
diagonal. One readily sees that the diagonal entries are equal and
opposite.
12.9 See the comments about Exercises 12.4 and 12.8. 
12.11 With Hp of the given form we have 


H! =dX* 4 dm(a? ~ w?)X? — , 


and we can choose w and & so that H’ypo is a multiple of w3. 


Chapter 13 
13.2 Recall that 


$$\int_{\mathbb{R}^3} e^{-i\mathbf{p}\cdot\mathbf{x}}\, e^{-\frac12 a r^2}\, d^3x = (2\pi/a)^{3/2}\, e^{-p^2/2a}.$$


13.3 Differentiate the hint for 13.2 with respect to a. 
13.5 The exact ground state energy is 


1 


TA 127 43 
2 ae Bl peer anectiae A 
we + shu (1+ 512 Bus or :) 


The second-order Rayleigh-Schrödinger correction gives the terms
in $\lambda$ and $\lambda^2$, whilst the second-order Wigner-Brillouin
approximation also gives the third-order term. (The third-order
correction could also be obtained in Rayleigh-Schrödinger theory using
Corollary 12.4.2.)
Chapter 14


14.2 The trial functions are not normalized, so one must divide by $\|\psi\|^2$
when calculating the expectation values. The integrals which arise 
are all of the form 


k 
[eee dx = (-=) yee dx. 


Assume for the last part that the same formulae are valid when 7 is 
not integral. 

14.3 The stationary value occurs when $\lambda = -F/(E_0 + \hbar^2/2ma^2)$
and gives the bound


a” Fp? 

(Eo + fi? /2ma?) 

14.8 The wave functions $\psi_\alpha$ and $X\psi_\alpha$ are (unnormalized)
eigenvectors of the oscillator Hamiltonian
$P^2/2m + \tfrac12 m\alpha^2\omega^2X^2$, with energies
$\tfrac12\alpha\hbar\omega$ and $\tfrac32\alpha\hbar\omega$, respectively.
Using the virial theorem, one easily calculates the expectations of
$P^2/2m$ in the state $\psi_\alpha$, and of $m\omega^2X^2$, in the states
$\psi_\alpha$ and $X\psi_\alpha$. Since

$$\langle\psi_\alpha|X^4\psi_\alpha\rangle = E_{X\psi_\alpha}(X^2)\,\|X\psi_\alpha\|^2$$

and $\|X\psi_\alpha\|^2 = \langle\psi_\alpha|X^2\psi_\alpha\rangle$ is also
related to the expectation of an oscillator potential energy, the result
is now easily obtained with a minimum of calculation.
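The virial-theorem values can be confirmed by direct quadrature. The sketch below assumes the Gaussian trial function $\psi_\alpha(x) = \exp(-m\alpha\omega x^2/2\hbar)$ in units with $m = \hbar = \omega = 1$ (the trial function is not written out in the hint, so this form is an assumption), and the integration range and step count are illustrative. The virial theorem then predicts $E_{\psi_\alpha}(X^2) = 1/2\alpha$ and $E_{X\psi_\alpha}(X^2) = 3/2\alpha$.

```python
import math

def moment(power, alpha, half_width=12.0, steps=20000):
    """Trapezoidal estimate of the integral of x**power * psi_alpha(x)**2,
    ASSUMING psi_alpha(x) = exp(-alpha * x**2 / 2) with m = hbar = omega = 1."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** power * math.exp(-alpha * x * x)
    return total * h

def expectation_x2_in_psi(alpha):
    """E(X^2) in the state psi_alpha."""
    return moment(2, alpha) / moment(0, alpha)

def expectation_x2_in_x_psi(alpha):
    """E(X^2) in the state X psi_alpha."""
    return moment(4, alpha) / moment(2, alpha)

def expectation_x4_in_psi(alpha):
    """E(X^4) in psi_alpha, built as the product used in the hint."""
    return expectation_x2_in_x_psi(alpha) * expectation_x2_in_psi(alpha)
```

With these values the expectation of $X^4$ in $\psi_\alpha$ comes out as $3/4\alpha^2$ in the chosen units.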


Chapter 15 
15.1 Writing the energy as $E = \tfrac12 m\omega^2b^2$ one has

$$W = m\omega\int\sqrt{b^2 - x^2}\,dx = \tfrac12 m\omega\left(b^2\sin^{-1}(x/b) + x\sqrt{b^2 - x^2}\right).$$

Since the classical turning points are $\pm b$, the Bohr-Sommerfeld rule
gives $W(b) - W(-b) = (n + \tfrac12)\pi\hbar$, from which it follows that
$E = (n + \tfrac12)\hbar\omega$, so that the energy is exact in this case.
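The quantization condition in the hint to Exercise 15.1 can be checked numerically by evaluating the action integral between the turning points. This is a minimal sketch in units with $m = \omega = \hbar = 1$ by default; the substitution $x = b\sin\theta$, the midpoint rule, and the function name are choices made here for illustration.

```python
import math

def action_integral(energy, m=1.0, omega=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of sqrt(2m(E - V)) between the
    turning points -b and +b for V = m omega^2 x^2 / 2, using x = b sin(theta)."""
    b = math.sqrt(2 * energy / (m * omega ** 2))
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * h
        # momentum p = m omega b cos(theta), and dx = b cos(theta) dtheta
        total += m * omega * b * b * math.cos(theta) ** 2
    return total * h

def bohr_sommerfeld_defect(n, hbar=1.0, omega=1.0):
    """W(b) - W(-b) - (n + 1/2) pi hbar at the exact level E = (n + 1/2) hbar omega."""
    energy = (n + 0.5) * hbar * omega
    return action_integral(energy, omega=omega) - (n + 0.5) * math.pi * hbar
```

The defect should vanish to rounding error for every $n$, in line with the statement that the rule reproduces the exact oscillator levels.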
15.2 In the same notation as the previous hint $a$ is proportional to
$(b^2 - x^2)^{-1/4}$, from which $a''/a$ is easily calculated.


Chapter 16 


16.1 In terms of the rotated coordinates $y = (x_1 + x_2)/\sqrt{2}$ and
$z = (x_1 - x_2)/\sqrt{2}$, the potential can be expressed as
Ky? + 3(K —2)z?, whilst the Laplace operator is invariant under
rotation. Note that the even wave functions represent bosons, and that
an harmonic oscillator eigenstate $\psi_n$ with energy
$(n + \tfrac12)\hbar\omega$ is even if and only if $n$ is even. (This may
be proved from the generating function or from the fact that the wave
function is a multiple of $a_+^n\psi_0$, and $a_+$ is odd.)

16.2 The dimension of the bosonic space is (“*"). The possible energies 
for distinguishable particles are 


Bh 
4 


(G+ 1) — 210 + 1} 


for integral $j$ between 0 and $2l$, whilst for bosons only even values of
$j$ occur. (In each case the degeneracy of the level is $(2j + 1)$.)

16.3 For the energies see Exercise 8.12. For distinguishable particles the
possible energies are the triply degenerate level, $E_j + E_k - \kappa$,
and the non-degenerate level, $E_j + E_k + 3\kappa$, for each choice of
$j$ and $k$. For fermions $j$ and $k$ must be distinct in the case of the
levels $E_j + E_k - \kappa$. Compare the lowest levels of each type when
$4\kappa > (E_1 - E_0) > 0$.


Chapter 17 


17.1 After separation of variables the radial equation becomes 


and for normalizability the solutions should be negative exponentials. 
17.3 Show that $P$ commutes with the Hamiltonian as well as $J$. Notice
also that $(J.P)^2 = |P|^2$, so that the possible eigenvalues of $J.P$ are
related to those of $P$.
17.4 Show that 


Gr (Op) = Pp + ~K, 
17.5 For the last part show that 


ee a) 1(—Po, P) = ¥(Po, P) (2 -) ; 


Chapter 18 


18.3 For a first-order equation, such as the Dirac equation, the appropri- 
ate matching conditions are just that the wave function itself has 
all its components continuous at +a. It is easy to check that the 
given vector-valued wave functions do satisfy the Dirac equation in 
the regions $|x_3| < a$ and $x_3 > a$, under the conditions that

(Po + V)? — P? = m?c?, and =s- Pa’? +. K2 = m?e?, 
respectively. The continuity of each of the two-component pieces of
$\psi$ at $a$ gives the simultaneous equations relating $v$, $v'$, and $w$. For
bound states we try 


e-*Poto/h .Kes/h ( ee } 
Potme73U 


when $x_3 < -a$. (For general states we should add a second term with
$K$ replaced by $-K$ and $u$ by $u'$.) There are then two more matching
conditions at —a. Then u and w may be eliminated to give two 
equations linking v and v’. The final equation is the condition for 
these equations to admit non-trivial solutions. 